We consider multiple description coding for the Gaussian source with K descriptions under the symmetric
mean squared error distortion constraints, and provide an approximate characterization of the rate
region. We show that the rate region can be sandwiched between two polytopes, between which the gap
can be upper bounded by constants dependent on the number of descriptions, but independent of the exact
distortion constraints. Underlying this result is an exact characterization of the lossless multi-level diversity
source coding problem: a lossless counterpart of the MD problem. This connection provides a polytopic
template for the inner and outer bounds to the rate region. In order to establish the outer bound, we generalize
Ozarow's technique to introduce a strategic expansion of the original probability space by more than
one random variable. For the symmetric rate case with any number of descriptions, we show that the gap
between the upper bound and the lower bound for the individual description rate is no larger than 0.92 bit.
The results developed in this work also suggest that the "separation" approach of combining successive refinement
quantization and lossless multi-level diversity coding is a competitive one, since it is only a constant
away from the optimum. The results are further extended to general sources under the mean squared error
distortion measure, where a similar but looser bound on the gap holds.

1 Introduction

In the multiple description (MD) problem, a source is encoded into several descriptions such that any one 
of them can be used to reconstruct the source with certain quality, and more descriptions can improve the 
reconstruction. The problem is well motivated by source transmission over unreliable networks and distributed
storage systems, since there exists uncertainty as to which transmissions are received successfully (or which 
servers are accessible) by the end user. 

In the early works on this problem, for example [1, 2], only two descriptions are considered. Even in 
this setting, the quadratic Gaussian problem is the only completely solved case [2], for which the achievable 
region in [1] is tight. Through a counter-example, Zhang and Berger showed that this achievable region is 
however not tight in general [3], and a complete characterization of the rate-distortion (R-D) region has not 
been found to this date. See [4] (and the references therein) for a review of works related to this problem in 
the information theory literature. 

Recent research attention has shifted to the general K-description problem, partly motivated by the availability
of multiple transmission paths in modern communication networks. In [5] [6], an achievable individual
description rate was provided for symmetric multiple descriptions, where each description has the same rate,
and the distortion constraint depends only on the number of descriptions available. This achievable region is 
based on joint binning of the codebooks for each description, which is similar in flavor to the method often
used in distributed source coding problems. Another achievable region was given in [7] using more conventional
conditional codebooks. Wang and Viswanath [8,9] generalized the Gaussian MD problem to the vector
Gaussian source with many descriptions, and a tight sum-rate lower bound was established for certain cases
with only two levels of distortion constraints (see also the outer bound result in [7]).

In this work, we consider general multiple description coding with $K$ descriptions under symmetric distortion
constraints. The distortion constraints are symmetric in the sense that with any $k \le K$ descriptions,
the reconstruction has to satisfy the distortion $D_k$, regardless of which specific combination of $k$ descriptions
is used. Though the distortion constraints are symmetric, the rates of the descriptions are not necessarily the
same in this setting, thus generalizing the case treated in [5] [6]. Nevertheless the completely symmetric case
as considered in [5] [6], i.e., with both symmetric rate and distortion constraints, is indeed an interesting special
case, and will be treated with particular care. Our main focus is on the Gaussian source under the mean
squared error (MSE) distortion constraint; however, we also show that the results can be extended to more
general sources under the same distortion measure.

Though completely characterizing the rate-distortion region of the Gaussian multiple description problem
is difficult if not impossible, we provide an approximate characterization. Underlying this approximation is
the lossless symmetric multi-level diversity (MLD) coding problem previously studied in [15, 16]; see Fig. 1.
The MLD coding problem can be interpreted as a lossless version of the MD problem, and thus one of
our main insights is to use the MLD rate region as a polytopic template for inner and outer bounding the MD
rate-distortion region. We show that the MD rate-distortion region can be sandwiched between two polytopes,
between which the gap can be upper bounded by constants dependent on the number of descriptions, but
independent of the exact distortion constraints. The MD coding system is illustrated in Fig. 1 for $K = 3$
together with the MLD coding system.

One of the main contributions of this work is a novel lower bound to the sum rate for the Gaussian source, 
under K levels of symmetric distortion constraints. This generalizes previous results in [2, 8, 9], where only 
two levels of distortion constraints are enforced in the system. Though the lower bound given here may not 
be tight, it is the first provably good bound with more than two levels of distortion constraints enforced, to 
the best of our knowledge. We derive this lower bound by generalizing Ozarow's technique in treating the 
Gaussian two-description problem. More specifically, we expand the probability space of the original problem 
by more than one auxiliary random variable, and impose a certain Markov structure on these random variables.
Ozarow's technique has been applied to various problems besides the MD problem [2,7-9], for example, the
results on multi-terminal source coding by Wagner and Anantharam [10], and the joint source channel coding
problem with bandwidth expansion by Reznic et al. [11]. However, in all these previous works the probability
space is expanded by only one additional auxiliary random variable (in [8, 9] it is one additional auxiliary
random vector since a vector source was being considered). Recently a similar technique has also been applied
to the Gaussian interference channel problem [12], and interestingly the results there indeed require expanding
the probability space by more than one random variable. The MD sum rate lower bound given in our work
can be optimized over $K - 1$ variables to provide the tightest bound. However, an explicit solution for this
optimization problem appears difficult; instead we choose a specific set of values to provide a suboptimal
lower bound, which nevertheless still offers insight into the problem and allows us to give an approximate
characterization of the MD rate region.

For the inner bounds, we analyze two achievability schemes: the first is a very simple scheme based
on successive refinement coding [13, 14] coupled with multi-level diversity coding (SR-MLD) [15-18]; the
second is a generalization of the multilayer coding scheme proposed by Puri, Pradhan and Ramchandran
[5] [6], which we will refer to as the PPR multilayer scheme. In the special case of symmetric rate, the
first scheme reduces to the well-known unequal loss protection method [20], and we thus also refer to it as
the SR-ULP scheme. The SR-MLD (or SR-ULP) scheme is in fact a separation-based scheme where the
quantization step and lossless source coding step are performed separately. As illustrated in Fig. 3, the output
of a successive refinement code is cascaded with the lossless multi-level diversity coding scheme.

Figure 1: MD and MLD coding system diagrams for $K = 3$. More details on MLD coding are given in the
next section.

The generalization of the second scheme of [5] [6] has two aspects: we first show that the definition of
the symmetric distribution, over which the scheme is optimized, can be relaxed straightforwardly; secondly,
by introducing an additional coding component and invoking results on α-resolution, we establish an achievable
region that matches the polytopic template of the MLD coding rate region. Interestingly, the achievable rate region
under a fixed set of auxiliary random variables is not a contra-polymatroid, unlike those often seen in other
multi-terminal source coding problems.

With the inner and outer bounds, we quantify the difference between them. For the symmetric rate problem,
the individual-description rate-distortion (R-D) function can be bounded within a constant depending
only on the number of descriptions, but not the distortion constraints. Moreover, regardless of the number of
descriptions, the gap between the lower bound and the upper bound using the SR-ULP coding scheme is less
than 1.48 bits, and for the PPR multilayer scheme, the gap is less than 0.92 bit. In order to establish these
results, a method similar to the enhancement technique in [19] is used. We also generalize the results to other
sources under the mean squared error constraints, and show the sum rate gap between lower and upper bounds
can be bounded within a constant, depending also only on the number of descriptions.

In addition to providing an approximate characterization of the symmetric individual-description R-D
function, we also consider the rate region under symmetric distortion constraints. We first illustrate the
basic ideas explicitly by considering the three-description case, and then extend the result to the general
$K$-description problem. For the three-description case, we show that the outer and inner bounds can be represented
by ten planes with matching normal directions, and the Euclidean distances between the corresponding
planes are shown to be less than certain small constants; these results are illustrated in Fig. 2. Then using the
α-resolution approach introduced in [16], we show that for the general $K$-description Gaussian problem under
symmetric distortion constraints, the bounding planes of the rate region can be bounded both from above and
below, between which the gap is bounded, and subsequently provide an approximate characterization of the
R-D region.

Figure 2: Bounding the rate distortion region for the three-description case, where the distances between
corresponding planes of the inner and outer bounds are measured by Euclidean distance. The inner bound is
drawn with dashed lines, and the outer bound with solid lines.

Figure 3: The separation approach based on successive refinement and lossless multi-level diversity coding.

It is surprising that the simple separation-based scheme of combining successive refinement and lossless 
multi-level diversity coding is able to achieve performance only a constant away from the optimal scheme; see
Fig. 3 for an illustration of this system. This result implies that in certain practical high-rate applications, this
simple scheme may be sufficient, since additional gain will require much more complicated system design,
and the resulting system will be significantly less flexible. Moreover, when distortion constraints are placed
only on the last $k$ levels for the decoders with $K - k + 1, K - k + 2, \ldots, K$ descriptions, we show that even the
gap between the lower and upper bound on the sum rate is asymptotically diminishing when the total number 
of descriptions K becomes large with k fixed. Thus virtually no gain is possible even in terms of sum rate for 
this case. 

We emphasize that the general approach used in approximating the MD rate-distortion region is likely
to provide insightful results for other network source (and channel) coding problems. More precisely, even
though the exact rate-distortion region (or capacity region) of a multiuser information theory problem may
have a general convex shape with a curved boundary, simple polytopic inner and outer bounds are likely to exist
which can provide a good approximate characterization. Compared to general bounds, polytopic bounds are
much easier to analyze. To apply this approach, it is desirable that the inner and outer bounds both follow a
"common template" such that they can be conveniently compared. The result in our work suggests that a good 
choice of the template for a rate-distortion problem is the underlying lossless compression problem. 

The rest of the paper is organized as follows. In Section 2 we provide a formal definition of the problem,
and then briefly review the multi-level diversity problem and the α-resolution method. In Section 3 we present
a set of simplified results for the case with three descriptions as an illustrative example. Section 4 summarizes
the main results of the paper. In Section 5 we focus on deriving the upper and lower bounds for the sum rate,
and in Section 6 the inner and outer bounds for the rate region are presented. Finally Section 7 concludes the
paper. Detailed and technical proofs are given in the appendices.

2 Notation, problem formulation and review 

In this section we first provide the necessary notation and the problem definition, then briefly review the
multi-level diversity coding problem and some essential α-resolution results [15, 16] which play an important
role in the development of our results. Wherever the notation or definitions become less transparent, we will
specialize them to the three-description case, i.e., the case $K = 3$. This special case will continue to serve as
our working example, particularly in Section 3.

2.1 Notation and problem definition 

Let $\{X(i)\}_{i=1,2,\ldots}$ be a memoryless stationary source. At each time index $i$, the random variable $X(i)$ in an
alphabet $\mathcal{X}$ is governed by the same distribution law $\mu_X$. In most of this work, we assume $\mathcal{X} = \mathbb{R}$, i.e., the
real alphabet; moreover the reconstruction alphabet is also usually assumed to be $\mathbb{R}$. We use $\mathbb{R}_+$ to denote
the set of non-negative reals. The vector $(X(1), X(2), \ldots, X(n))$ will be denoted as $X^n$. Capital letters are used
for random variables, and the corresponding lower-case letters are used for the realizations of these random
variables. Let $d : \mathcal{X} \times \mathcal{X} \to [0, \infty]$ be a single-letter distortion measure; its multi-letter extension is
defined as

$$ d(x^n, y^n) = \frac{1}{n} \sum_{i=1}^{n} d(x(i), y(i)). \tag{1} $$

In this work, we are particularly interested in the squared error distortion measure $d(x, y) = (x - y)^2$. As such,
it will be assumed without loss of generality that the source has a normalized unit variance. In this context,
the most important case is the zero-mean unit-variance Gaussian source $X \sim \mathcal{N}(0, 1)$. In fact for the majority
of this work we shall only consider the Gaussian source, unless stated otherwise explicitly.

We shall adopt most of the notation in [16] introduced for the multi-level diversity coding (MLD) problem,
which can be understood as a special case of the multiple description problem as we shall explain shortly.
Throughout the paper, boldface letters are used to denote $K$-vectors. For the general $K$-description problem
being considered, a length-$n$ block of the source samples is encoded into $K$ descriptions. Let $\mathbf{v}$ be a vector in
$\{0, 1\}^K$, and denote the $i$-th component of $\mathbf{v}$ by $v_i$. Define

$$ \Omega_K^\alpha = \{ \mathbf{v} \in \{0, 1\}^K : |\mathbf{v}| = \alpha \}, \quad \alpha = 1, 2, \ldots, K, \tag{2} $$

where $|\mathbf{v}|$ is the Hamming weight of $\mathbf{v}$, and define $\Omega_K = \bigcup_{\alpha=1}^{K} \Omega_K^\alpha$. Essentially, the set $\Omega_K$ provides a compact
way to enumerate the possible combinations of the descriptions, or equivalently a compact way to enumerate
the possible decoders. Particularly for the case of $K = 3$, we have

$$ \Omega_3 = \Omega_3^1 \cup \Omega_3^2 \cup \Omega_3^3 = \{100, 010, 001\} \cup \{110, 101, 011\} \cup \{111\}. \tag{3} $$

Decoder $\mathbf{v}$, $\mathbf{v} \in \Omega_K$, has access to the $|\mathbf{v}|$ descriptions in the set $G_{\mathbf{v}} = \{i : v_i = 1\}$. For the case $K = 3$, we
have

$$ G_{100} = \{1\},\ G_{010} = \{2\},\ G_{001} = \{3\},\ G_{110} = \{1, 2\},\ G_{101} = \{1, 3\},\ G_{011} = \{2, 3\},\ G_{111} = \{1, 2, 3\}. \tag{4} $$

The symmetric distortion constraints are given such that any decoder $\mathbf{v}$ can reconstruct the source to satisfy a
certain distortion $D_{|\mathbf{v}|}$, i.e., the distortion constraint depends only on the number of descriptions the decoder
has access to, but not the particular combination of descriptions.
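The decoder indexing above can be made concrete with a short script. The following is only an illustrative sketch in Python; the function names `decoders` and `description_set` are hypothetical and not part of the paper. It enumerates the sets $\Omega_K^\alpha$ and the description sets $G_{\mathbf{v}}$ for a given $K$.

```python
from itertools import product

def decoders(K):
    """Return {alpha: [v, ...]}, where each v is a 0/1 tuple of Hamming weight alpha."""
    levels = {alpha: [] for alpha in range(1, K + 1)}
    for v in product((0, 1), repeat=K):
        weight = sum(v)
        if weight > 0:  # the all-zero vector does not index a decoder
            levels[weight].append(v)
    return levels

def description_set(v):
    """G_v: the 1-based indices of the descriptions available to decoder v."""
    return {i + 1 for i, bit in enumerate(v) if bit == 1}

levels = decoders(3)
for alpha, vs in levels.items():
    for v in vs:
        print(alpha, "".join(map(str, v)), sorted(description_set(v)))
```

For $K = 3$ this lists the seven decoders of (3) together with the sets in (4); in general there are $2^K - 1$ decoders.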

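Formally, the problem is defined as follows. An $(n, (M_i, i \in I_K), (\Delta_{\mathbf{v}}, \mathbf{v} \in \Omega_K))$ code, where $I_k =
\{1, 2, \ldots, k\}$, is defined as

$$ S_i : \mathcal{X}^n \to I_{M_i}, \quad i \in I_K, \tag{5} $$

$$ T_{\mathbf{v}} : \prod_{i \in G_{\mathbf{v}}} I_{M_i} \to \mathcal{X}^n, \quad \mathbf{v} \in \Omega_K, \tag{6} $$

and

$$ \Delta_{\mathbf{v}} = E\, d(X^n, \hat{X}_{\mathbf{v}}^n), \quad \mathbf{v} \in \Omega_K, \tag{7} $$

where

$$ \hat{X}_{\mathbf{v}}^n = T_{\mathbf{v}}(S_i(X^n), i \in G_{\mathbf{v}}), \tag{8} $$

and $E$ is the expectation operator. For the case $K = 3$, we have three encoders $S_1(\cdot)$, $S_2(\cdot)$ and $S_3(\cdot)$, and
seven decoders $T_{100}$, $T_{010}$, $T_{001}$, $T_{110}$, $T_{101}$, $T_{011}$ and $T_{111}$, each decoder being associated with a reconstructed
source sequence and inducing an expected distortion $\Delta_{\mathbf{v}}$.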

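A $K$-tuple $(R_1, R_2, \ldots, R_K)$ is $(D_1, D_2, \ldots, D_K)$-admissible if for every $\epsilon > 0$, there exists for sufficiently
large $n$ an $(n, (M_i, i \in I_K), (\Delta_{\mathbf{v}}, \mathbf{v} \in \Omega_K))$ code such that

$$ \frac{1}{n} \log M_i \le R_i + \epsilon, \quad i \in I_K, \tag{9} $$

and

$$ \Delta_{\mathbf{v}} \le D_{|\mathbf{v}|} + \epsilon, \quad \mathbf{v} \in \Omega_K. \tag{10} $$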

Throughout the paper, we use logarithms of base 2, so that the rate is measured in bits. Let $\mathcal{R}(\mathbf{D})$ be the
collection of all $\mathbf{D}$-admissible rate vectors; this is the region of interest in this work. In the following
sections, we shall assume $1 \ge D_1 \ge D_2 \ge \cdots \ge D_K > 0$ without loss of generality. One important special
case is when the rates of all the descriptions are the same, i.e., $M_i = M$ for any $i \in I_K$. For this symmetric
rate case, the symmetric individual-description rate distortion (R-D) function $R(\mathbf{D})$ is defined simply as

$$ R(\mathbf{D}) = \inf_{\{R \,:\, R \ge R_i,\ (R_1, R_2, \ldots, R_K) \in \mathcal{R}(\mathbf{D})\}} R. \tag{11} $$

Figure 4: The similarity between MD coding and MLD coding for $K = 3$. They are essentially the same
source coding problem with different distortion criteria.



Since $\mathcal{R}(\mathbf{D})$ is a closed set, the infimum can in fact be replaced by a minimum. Though in (11) we do
not explicitly enforce the constraint that $R_1 = R_2 = \cdots = R_K$, it is straightforward to see this constraint
can be added without causing any essential difference. For the case $K = 3$, we often expand the distortion
vector and write the rate region and symmetric individual-description rate distortion function (SID-RD) as
$\mathcal{R}(D_1, D_2, D_3)$ and $R(D_1, D_2, D_3)$, respectively.

Throughout the paper, when a rate $R$ is of interest, we use $\overline{R}$ or $\tilde{R}$ to denote its inner (upper) bounds, and
use $\underline{R}$ to denote its outer (lower) bound; when a rate region $\mathcal{R}$ is of interest, a similar convention is taken.



2.2 A brief review of the symmetric multi-level diversity coding problem 

The symmetric MLD coding problem considered in [15, 16] can be described as follows. A total of $K$ independent
sources $V_1, V_2, \ldots, V_K$ are observed at the encoder, and encoded into $K$ descriptions. A decoder
$T_{\mathbf{v}}$, which is called a level-$|\mathbf{v}|$ decoder, should reconstruct $V_1, V_2, \ldots, V_{|\mathbf{v}|}$ losslessly in the Shannon sense¹.
Particularly in the case of $K = 3$, three independent sources $V_1$, $V_2$ and $V_3$ are observed at the encoder, and
encoded into three descriptions. The first level decoders $T_{100}$, $T_{010}$ and $T_{001}$ should reconstruct $V_1$ losslessly,
the second level decoders $T_{110}$, $T_{101}$ and $T_{011}$ should reconstruct $(V_1, V_2)$, and the third level decoder $T_{111}$
should reconstruct $(V_1, V_2, V_3)$.

In the framework of MD coding introduced above, we can simply treat the multi-source $V_1, V_2, \ldots, V_K$ as
the single super source $X$, and the distortion measure $d_{|\mathbf{v}|}(\cdot, \cdot)$ is level-dependent, and thus also decoder-dependent;
it is simply a Hamming distortion measure operating only on $V_1, V_2, \ldots, V_{|\mathbf{v}|}$. Therefore the
lossless symmetric MLD coding problem essentially provides the solution to this symmetric MD problem
at an extreme point of zero distortions for discrete memoryless sources; Fig. 1 and Fig. 4 illustrate the
connection between the two problems in terms of the encoding/decoding functions and the distortion measure,
respectively.

The main result for the symmetric MLD coding problem in [15, 16] is that source separation coding² is
in fact optimal for this problem. The source separation coding scheme and the corresponding region can
be described as follows. Each source vector is encoded independently of the other sources, and the $i$-th
description is allocated rate $r_i^\alpha$ for the $\alpha$-th source $V_\alpha$. Each description is then the collection of
encoded information (codes) produced for all the sources. The rate region is thus the set of non-negative rate
vectors $\mathbf{R}$ that satisfy the following condition [16]

$$ R_i = \sum_{\alpha=1}^{K} r_i^\alpha, \quad i = 1, 2, \ldots, K, \tag{12} $$

for some $r_i^\alpha \ge 0$, $\alpha = 1, 2, \ldots, K$, such that

$$ \sum_{i \in G_{\mathbf{v}}} r_i^{|\mathbf{v}|} \ge H(V_{|\mathbf{v}|}), \quad \mathbf{v} \in \Omega_K. \tag{13} $$

¹ It can be shown that lossless in the Shannon sense and lossless with diminishing Hamming distortion do not cause an essential
difference.

² This coding scheme was originally called superposition coding, but here we adopt the name source separation coding as
suggested by Raymond Yeung, in order to avoid confusion with the superposition coding in broadcast channels.

The collection of information in all the descriptions may be redundant for any given source $V_\alpha$, $\alpha < K$,
though any given specific description is maximally compressed by itself. Clearly, the equality in (12) can
be replaced by $\ge$ without loss of generality. As pointed out in [17], the condition (13) has an interpretation
closely related to Slepian-Wolf coding: the source words are randomly binned (for the $i$-th description)
with rate $r_i^\alpha$, such that the source vector can be recovered as long as the sum rate from any $\alpha$ descriptions
for this source is larger than $H(V_\alpha)$. In [17], a connection to the maximum distance separable (MDS) codes
was used to prove this result. Indeed, the Slepian-Wolf interpretation and the MDS codes interpretation are
closely related in this setting.
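As a small illustration of conditions (12) and (13), the sketch below checks whether a given allocation of per-source rates is feasible and, if so, returns the resulting description rates. The function name `separation_rates`, the 0-based indexing, and the numerical entropies are assumptions made only for this example.

```python
from itertools import combinations

def separation_rates(r, entropies, tol=1e-12):
    """r[i][a] is the rate description i spends on source a (both 0-based).

    Returns the description rates R_i of (12) if condition (13) holds for
    every decoder, and None otherwise.
    """
    K = len(entropies)
    for a in range(K):  # source V_{a+1} must be decodable from any a+1 descriptions
        for subset in combinations(range(K), a + 1):
            if sum(r[i][a] for i in subset) < entropies[a] - tol:
                return None
    return [sum(row) for row in r]

H = [1.0, 2.0, 3.0]  # hypothetical entropies H(V_1), H(V_2), H(V_3) in bits
K = len(H)
# Symmetric allocation: every description spends H(V_a)/a on the a-th source.
r_sym = [[H[a] / (a + 1) for a in range(K)] for _ in range(K)]
print(separation_rates(r_sym, H))
```

The symmetric allocation meets every constraint in (13) with equality, which matches the MDS-code interpretation mentioned above.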



2.3 Review of the a-resolution results 

The rate region characterization (12) and (13) for the MLD coding problem is given in a parametrized form, i.e.,
involving variables other than the rate tuple of interest $(R_1, R_2, \ldots, R_K)$. Though for smaller values of $K$, e.g.,
$K = 3$, it is possible to explicitly investigate the faces and vertex points of the rate region, for larger values of
$K$ this becomes intractable. To overcome this difficulty, the α-resolution method was invented in [16] to reveal
the inherent structure of the MLD coding rate region. Next we directly quote a few definitions and results from
[16]; some further results will be given after related notation is properly introduced. Readers may skip the
lemmas and theorem in this subsection on a first reading; they are not needed until later sections.

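Let $\mathbf{u}$ and $\mathbf{v}$ be two vectors in $\mathbb{R}^K$. Define $\mathbf{u} \ge \mathbf{v}$ if and only if $u_i \ge v_i$, $\forall i \in I_K$. Similar notation holds
for $\mathbf{u}, \mathbf{v} \in \{0, 1\}^K$. For any $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \ldots, \lambda_K) \ge 0$, a mapping $c_\alpha : \Omega_K^\alpha \to \mathbb{R}_+$, where $\mathbb{R}_+$ is the set of
non-negative real numbers, satisfying the following properties

$$ c_\alpha(\mathbf{v}) \ge 0, \quad \text{for all } \mathbf{v} \in \Omega_K^\alpha, \tag{14} $$

and

$$ \sum_{\mathbf{v} \in \Omega_K^\alpha} c_\alpha(\mathbf{v}) \mathbf{v} \le \boldsymbol{\lambda} \tag{15} $$

is called an α-resolution for $\boldsymbol{\lambda}$; it will be denoted as $\{c_\alpha(\mathbf{v})\}$ or simply as $c_\alpha$. Define a function $f_\alpha : \mathbb{R}_+^K \to \mathbb{R}_+$
for $\alpha \in I_K$ by

$$ f_\alpha(\boldsymbol{\lambda}) = \max \sum_{\mathbf{v} \in \Omega_K^\alpha} c_\alpha(\mathbf{v}), \tag{16} $$

where the maximum is taken over all the α-resolutions of $\boldsymbol{\lambda}$. If $\{c_\alpha(\mathbf{v})\}$ achieves $f_\alpha(\boldsymbol{\lambda})$, then it is called an
optimal α-resolution for $\boldsymbol{\lambda}$, or simply α-optimal. Without loss of generality up to a permutation of the rate
vector components, we may assume

$$ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_K. \tag{17} $$

Definition 2.1 Let $\{c_\alpha(\mathbf{v})\}$ be an α-resolution of $\boldsymbol{\lambda}$, then $\sum_{\mathbf{v} \in \Omega_K^\alpha} c_\alpha(\mathbf{v}) \mathbf{v}$ is called the profile of $\{c_\alpha(\mathbf{v})\}$.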
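To make the definitions of an α-resolution and its profile more tangible, here is a minimal sketch that checks properties (14) and (15) for a candidate mapping and computes its profile. The dictionary representation, the function names, and the example vector are illustrative assumptions, not notation from [16].

```python
def profile(c, K):
    """Profile sum_v c(v) * v of a resolution stored as {v_tuple: weight}."""
    return [sum(w * v[i] for v, w in c.items()) for i in range(K)]

def is_alpha_resolution(c, lam, alpha, tol=1e-12):
    """Check (14) and (15): non-negative weights on weight-alpha vectors, profile <= lam."""
    K = len(lam)
    if any(len(v) != K or sum(v) != alpha or w < 0 for v, w in c.items()):
        return False
    return all(p <= l + tol for p, l in zip(profile(c, K), lam))

lam = [3.0, 2.0, 1.0]                   # hypothetical vector with lam_1 >= lam_2 >= lam_3
c2 = {(1, 1, 0): 2.0, (1, 0, 1): 1.0}   # a candidate 2-resolution
print(is_alpha_resolution(c2, lam, 2), profile(c2, 3), sum(c2.values()))
```

Because every $\mathbf{v} \in \Omega_K^\alpha$ has exactly $\alpha$ ones, summing the $K$ components of (15) shows that any α-resolution has total weight at most $\frac{1}{\alpha} \sum_i \lambda_i$. The candidate above has total weight $3 = 6/2$, so in this particular example it is 2-optimal.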

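Lemma 2.1 ([16], Lemma 1) Let $\{c_\alpha(\mathbf{v})\}$ be α-optimal for $\boldsymbol{\lambda}$, and let $(\tilde{\lambda}_1, \tilde{\lambda}_2, \ldots, \tilde{\lambda}_K)$ be its profile. If there
exists $1 \le i \le K$ such that $\lambda_i - \tilde{\lambda}_i > 0$, then $c_\alpha(\mathbf{v}) > 0$ implies $v_i = 1$.

Lemma 2.2 ([16], Lemma 2) Let $\{c_\alpha(\mathbf{v})\}$ be α-optimal for $\boldsymbol{\lambda}$, and let $(\tilde{\lambda}_1, \tilde{\lambda}_2, \ldots, \tilde{\lambda}_K)$ be its profile, then
there exists $0 \le l_\alpha \le \alpha - 1$ such that $\lambda_i - \tilde{\lambda}_i > 0$ if and only if $1 \le i \le l_\alpha$.

Definition 2.2 For $2 \le \alpha \le K$, let $c_\alpha$ and $c_{\alpha-1}$ be α-optimal and $(\alpha - 1)$-optimal for $\boldsymbol{\lambda}$, respectively. Then
$c_{\alpha-1}$ covers $c_\alpha$, denoted by $c_{\alpha-1} \succ c_\alpha$, if

$$ \sum_{\mathbf{u} \in \Omega_K^{\alpha-1}} c_{\alpha-1}(\mathbf{u}) H(S_i, i \in G_{\mathbf{u}}) \ge \sum_{\mathbf{v} \in \Omega_K^\alpha} c_\alpha(\mathbf{v}) H(S_i, i \in G_{\mathbf{v}}), \tag{18} $$

for any $K$ jointly distributed random variables $S_1, S_2, \ldots, S_K$.

The following lemma is straightforward with the above definitions.

Lemma 2.3 Let $c_{\alpha-1}$ and $c_\alpha$ be $(\alpha - 1)$-optimal and α-optimal, respectively. If $c_{\alpha-1} \succ c_\alpha$, then
$(\alpha - 1) f_{\alpha-1}(\boldsymbol{\lambda}) \ge \alpha f_\alpha(\boldsymbol{\lambda})$.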

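Proof 1 (Proof of Lemma 2.3) Let $S_1, S_2, \ldots, S_K$ be independently and identically distributed random variables
with entropy $H(S_i) = H(S) > 0$ for any $i \in I_K$, then it follows

$$ \begin{aligned}
(\alpha - 1) f_{\alpha-1}(\boldsymbol{\lambda}) H(S) &= \sum_{\mathbf{u} \in \Omega_K^{\alpha-1}} c_{\alpha-1}(\mathbf{u}) (\alpha - 1) H(S) = \sum_{\mathbf{u} \in \Omega_K^{\alpha-1}} c_{\alpha-1}(\mathbf{u}) H(S_i, i \in G_{\mathbf{u}}) \\
&\ge \sum_{\mathbf{v} \in \Omega_K^\alpha} c_\alpha(\mathbf{v}) H(S_i, i \in G_{\mathbf{v}}) = \sum_{\mathbf{v} \in \Omega_K^\alpha} c_\alpha(\mathbf{v}) \alpha H(S) = \alpha f_\alpha(\boldsymbol{\lambda}) H(S).
\end{aligned} \tag{19} $$

Dividing both ends by $H(S)$ completes the proof.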

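By using Lemma 2.3 and the definition of $f_\alpha(\boldsymbol{\lambda})$, the following lemma is rather immediate.
Lemma 2.4 The following are true.

• The optimal 1-resolution is unique, $c_1(\mathbf{v}) = \lambda_i$ for $G_{\mathbf{v}} = \{i\}$. Moreover $f_1(\boldsymbol{\lambda}) = \sum_{i=1}^{K} \lambda_i \triangleq \lambda_{sum}$.

• The optimal $K$-resolution is unique, $c_K(\mathbf{v}) = f_K(\boldsymbol{\lambda}) = \min_{i \in I_K} \lambda_i$, where $G_{\mathbf{v}} = I_K$.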

• For any a such that K > a > 2, fa{A) < 

The following theorem is instrumental for the result presented in [16], and it is also important for us to 
establish the result on the MD R-D region for K > 

Theorem 2.1 ([16], Theorem 3) For any $\boldsymbol{\lambda} \ge 0$, there exist $c_\alpha$, $1 \le \alpha \le K$, where $c_\alpha$ is α-optimal for $\boldsymbol{\lambda}$,
such that

$$ c_1 \succ c_2 \succ \cdots \succ c_K. \tag{20} $$

3 A simple approximation for K = 3



In this section we give a set of approximate characterizations of the SID-RD function and the rate-distortion
region for the three-description case. For the sake of simplicity, only the simple SR-MLD scheme is considered,
and consequently the results in this section are not as strong as those given in the following sections;
however, we choose to present them first for better exposition. The approximate rate-distortion region characterization
for this case has a more explicit algebraic form, and can also be illustrated pictorially, which suits
the purpose of facilitating understanding particularly well. Moreover, as we shall show, even this set of
simple results in fact provides quite a good approximation for the three-description case.



3.1 Approximating the symmetric individual description rate distortion function 
3.1.1 A simple upper bound 

For the symmetric rate case, the source separation coding scheme reduces to the following simple unequal loss
protection scheme; see, for example, [20]. Sources $V_1$, $V_2$, $V_3$ are losslessly compressed independently of each
other. The encoded $V_1$ is repeated in all three descriptions; a (3, 2) maximum distance separable (MDS) code
is applied to the encoded $V_2$ bitstream, and the resulting codeword is evenly split into each description; the
encoded $V_3$ is then evenly split into each description without additional coding. This simple scheme clearly
has the symmetric individual description rate of $H(V_1) + \frac{1}{2} H(V_2) + \frac{1}{3} H(V_3)$.
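The rate accounting of this unequal loss protection scheme can be sketched in a few lines. The snippet below assumes the natural extension to $K$ layers in which the $a$-th layer is protected by a $(K, a)$ MDS code and therefore contributes $H(V_a)/a$ to every description; the text above states only the $K = 3$ case, and the function name `ulp_rate` is hypothetical.

```python
from fractions import Fraction

def ulp_rate(entropies):
    """Per-description rate when layer a (1-based) contributes H(V_a)/a.

    For K = 3 this is H(V_1) + H(V_2)/2 + H(V_3)/3.
    """
    return sum(Fraction(h) / a for a, h in enumerate(entropies, start=1))

# Hypothetical layer entropies of 6 bits each.
print(ulp_rate([6, 6, 6]))
```

Exact fractions are used only to keep the arithmetic transparent; floating-point entropies would work with a plain division instead.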

For the MD problem, consider now constructing the bitstream $B_i$ using the $i$-th layer of a successive
refinement code for the Gaussian source, to satisfy the distortion constraint $D_i$, for $i = 1, 2, 3$. This coding
structure is illustrated in Fig. 3, where the $i$-th layer output is taken to be the random source $V_i$. Since the
quadratic Gaussian source is successively refinable [13], the following rate of $B_i$ is achievable

$$ \frac{1}{2} \log \frac{D_{i-1}}{D_i}, \quad i = 1, 2, 3, \tag{21} $$

where $D_0 = 1$.

With $B_i$ playing the role of the source vector $V_i$, it is clear that the following individual description rate
is achievable, which provides a simple upper bound on the SID-RD function (defined in (11))

$$ R(D_1, D_2, D_3) \le \frac{1}{2} \log \frac{1}{D_1} + \frac{1}{4} \log \frac{D_1}{D_2} + \frac{1}{6} \log \frac{D_2}{D_3} = \frac{1}{12} \log \frac{1}{D_1^3 D_2 D_3^2}. \tag{22} $$



3.1.2 A simple lower bound 

Next we consider lower bounding the sum rate. To do this we write the following chain of inequalities.

$$ \begin{aligned}
n(R_1 + R_2 + R_3) &\overset{(a)}{\ge} H(S_1) + H(S_2) + H(S_3) - H(S_1 S_2 S_3 | X^n) \\
&\overset{(b)}{=} H(S_1) + H(S_2) + H(S_3) - H(S_1 S_2 S_3 | X^n) - \frac{1}{2} [H(S_1 S_2) + H(S_2 S_3) + H(S_1 S_3)] \\
&\quad + \frac{1}{2} [H(S_1 S_2) + H(S_2 S_3) + H(S_1 S_3)] - H(S_1 S_2 S_3) + H(S_1 S_2 S_3) \triangleq H_3,
\end{aligned} \tag{23} $$

where (a) is because $S_i$, $i = 1, 2, 3$, are deterministic functions of $X^n$; (b) is by adding and subtracting the
same term. This step may appear rather arbitrary; however, a closer look reveals that the terms bear similarity
to Han's inequality on subsets of random variables [22].

Next define $Y_2 = X + N_2$ and $Y_1 = X + N_1 + N_2$, where $N_1$ and $N_2$ are mutually independent Gaussian
random variables, also independent of the Gaussian source $X$, with variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Define
$d_1 = \sigma_1^2 + \sigma_2^2$ and $d_2 = \sigma_2^2$, whose values are to be chosen later. The following step is essential for establishing
the lower bound, which differs significantly from the technique of [2] and [8] in that we now utilize the two
auxiliary random variables $Y_1$ and $Y_2$³. Consider the following quantity

$$ \begin{aligned}
H_3' &= \Big\{ H(S_1 | Y_1^n) + H(S_2 | Y_1^n) + H(S_3 | Y_1^n) - \frac{1}{2} [H(S_1 S_2 | Y_1^n) + H(S_2 S_3 | Y_1^n) + H(S_1 S_3 | Y_1^n)] \Big\} \\
&\quad + \Big\{ \frac{1}{2} [H(S_1 S_2 | Y_2^n) + H(S_2 S_3 | Y_2^n) + H(S_1 S_3 | Y_2^n)] - H(S_1 S_2 S_3 | Y_2^n) \Big\}.
\end{aligned} \tag{24} $$

It is seen that $H_3' \ge 0$, because each brace in (24) is nonnegative by applying the conditional version of
Han's inequality [22]. Intuitively, we expect certain conditional independence to hold approximately such
that each brace is approximately zero. In this sense, the first brace roughly suggests that $Y_1$ is approximately
a reconstruction with only (and any) one description, such that the individual descriptions are independent
given $Y_1$; the second brace roughly suggests that $Y_2$ is approximately a reconstruction using only (and any)
two descriptions, such that pairs of descriptions are independent given $Y_2$. Then it follows

$$ \begin{aligned}
n(R_1 + R_2 + R_3) &\ge H_3 - H_3' \\
&= I(S_1; Y_1^n) + I(S_2; Y_1^n) + I(S_3; Y_1^n) + \frac{1}{2} [I(S_1 S_2; Y_2^n) - I(S_1 S_2; Y_1^n)] \\
&\quad + \frac{1}{2} [I(S_2 S_3; Y_2^n) - I(S_2 S_3; Y_1^n)] + \frac{1}{2} [I(S_1 S_3; Y_2^n) - I(S_1 S_3; Y_1^n)] \\
&\quad + [I(S_1 S_2 S_3; X^n) - I(S_1 S_2 S_3; Y_2^n)].
\end{aligned} \tag{25} $$

If $H_3'$ is close to zero, then the bounding above should yield a meaningful result, which is indeed the case.
We need the following lemma to proceed, the proof of which is given in the appendix. Note that this lemma is not
limited to the case of $K = 3$.

Lemma 3.1 Let $S_i$, $i \in I_K$, be a set of encoding functions such that there exist decoding functions to satisfy
the distortion constraints $\mathbf{D} = (D_1, D_2, \ldots, D_K)$. Let $Y_b = X + N_b$ and $Y_a = X + N_a + N_b$, where $N_a$ and $N_b$
are mutually independent Gaussian random variables independent of the Gaussian source $X$, with variances
$\sigma_a^2$ and $\sigma_b^2$, respectively. Then by defining $\sigma_b^2 = d_b$ and $\sigma_a^2 + \sigma_b^2 = d_a$, we have

1. Mutual information bound between encoding functions and a noisy source

$$ I(S_i, i \in G_{\mathbf{v}}; Y_a^n) \ge \frac{n}{2} \log \frac{1 + d_a}{D_{|\mathbf{v}|} + d_a}, \tag{26} $$

2. Bound on the mutual information difference between encoding functions and different noisy sources

$$ I(S_i, i \in G_{\mathbf{v}}; Y_b^n) - I(S_i, i \in G_{\mathbf{v}}; Y_a^n) \ge \frac{n}{2} \log \frac{(1 + d_b)(D_{|\mathbf{v}|} + d_a)}{(1 + d_a)(D_{|\mathbf{v}|} + d_b)}. \tag{27} $$

³ One can also optimize the distribution of auxiliary random variables $Y_1$ and $Y_2$; however, in this work we only consider the
specific Gaussian distribution given above, which yields relatively simple and easily computable bounds.

Clearly we can now apply the first statement in Lemma 3.1 to the first three terms in (25), and the second
statement in Lemma 3.1 to the first three brackets in (25), by choosing appropriate $d_a$ and $d_b$. For the last
bracket, let $\sigma_a^2 = d_2$ and $\sigma_b^2 = 0$ in Lemma 3.1; then again the second statement can be applied.

Any valid choice of $d_1$ and $d_2$, i.e., $d_1 \ge d_2 \ge 0$, yields a valid lower bound. One could optimize within
this set of lower bounds to find the tightest one; however, without a matching inner bound, solving this rather
involved optimization problem offers little insight. Instead, we shall choose some specific values, which
indeed provide insightful results. Without loss of generality we may assume $D_1 \ge D_2 \ge D_3$. Thus $d_1 = D_1$
and $d_2 = D_2$ are a valid choice, and subsequently we have

\[ R(D_1, D_2, D_3) \ge \frac{1}{3}(R_1 + R_2 + R_3) \]
\[ \qquad \ge \frac{1}{12} \log \frac{(1 + D_1)^3 (1 + D_2)(D_1 + D_2)^3 (D_2 + D_3)^2}{2^9 D_1^6 D_2^3 D_3^2} \]
\[ \qquad \overset{(a)}{\ge} \frac{1}{12} \log \frac{1}{D_1^3 D_2 D_3^2} - \frac{3}{4}, \tag{28} \]

where (a) is by using the facts $1 + D_i \ge 1$ and $D_i + D_{i+1} \ge D_i$ for $i = 1, 2$.
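As a numerical sanity check of (28), the following minimal sketch evaluates the three kinds of terms from Lemma 3.1 with the choice $d_1 = D_1$, $d_2 = D_2$ and compares the result with the closed form and with its relaxation. The function names and the test distortions are illustrative assumptions, logarithms are taken base 2, and rates are per source symbol (the factor $n$ is dropped).

```python
import math

def lemma_term(D, d_hi, d_lo):
    """Per-symbol value of (1/2) log2 of the ratio in statement 2 of Lemma 3.1.

    d_hi plays the role of d_a (noisier variable) and d_lo of d_b.
    Passing d_hi = math.inf gives statement 1 evaluated at d_a = d_lo.
    """
    if math.isinf(d_hi):
        return 0.5 * math.log2((1 + d_lo) / (D + d_lo))
    return 0.5 * math.log2((1 + d_lo) * (D + d_hi) / ((1 + d_hi) * (D + d_lo)))

def k3_lower_bound(D1, D2, D3):
    """Individual-rate lower bound for K = 3 with d_1 = D_1 and d_2 = D_2."""
    d1, d2 = D1, D2
    sum_rate = (3 * lemma_term(D1, math.inf, d1)   # three single-description terms
                + 3 * 0.5 * lemma_term(D2, d1, d2) # three pair brackets, weight 1/2
                + lemma_term(D3, d2, 0.0))         # last bracket: Y_b = X, d_b = 0
    return sum_rate / 3

def k3_closed_form(D1, D2, D3):
    """Middle expression of (28)."""
    num = (1 + D1) ** 3 * (1 + D2) * (D1 + D2) ** 3 * (D2 + D3) ** 2
    den = 2 ** 9 * D1 ** 6 * D2 ** 3 * D3 ** 2
    return math.log2(num / den) / 12

def k3_relaxed(D1, D2, D3):
    """Last expression of (28)."""
    return math.log2(1 / (D1 ** 3 * D2 * D3 ** 2)) / 12 - 0.75

if __name__ == "__main__":
    D = (0.5, 0.2, 0.05)  # illustrative distortions with D1 >= D2 >= D3
    print(k3_lower_bound(*D), k3_closed_form(*D), k3_relaxed(*D))
```

The first two printed values should agree up to floating-point error, and the third should not exceed them, mirroring step (a).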

3.1.3 Comparing the upper and lower bounds 

Combining (22) and (28), we have

\[ \frac{1}{12} \log \frac{1}{D_1^3 D_2 D_3^2} \ge R(D_1, D_2, D_3) \ge \frac{1}{12} \log \frac{1}{D_1^3 D_2 D_3^2} - \frac{3}{4}. \tag{29} \]

The beginning and the end of the inequalities differ only by a constant 3/4 bit, which provides an approximation for
the SID-RD function. This result reveals that the simple SR-ULP scheme is surprisingly competitive, since it
is within 3/4 bit of the optimum performance.

3.2 Approximating the rate-distortion region 
3.2.1 A simple inner bound 

For $K = 3$, the symmetric MLD coding rate region given in (12) and (13) can be written explicitly in the
following form by applying the Fourier-Motzkin elimination [21] (see also [15]):

\[ R_i \ge H_1, \quad i = 1, 2, 3, \tag{30} \]
\[ R_i + R_j \ge 2H_1 + H_2, \quad i \ne j, \; i, j \in \{1, 2, 3\}, \tag{31} \]
\[ 2R_i + R_j + R_k \ge 4H_1 + 2H_2 + H_3, \quad (i, j, k) \text{ is a permutation of } (1, 2, 3), \tag{32} \]
\[ R_1 + R_2 + R_3 \ge 3H_1 + \frac{3}{2}H_2 + H_3, \tag{33} \]

where $H_i = H(V_i)$ for $i = 1, 2, 3$.
Figure 5: Simple inner and outer bounds for $\mathcal{R}(D_1, D_2, D_3)$. The gaps between the corresponding planes are
measured by the Euclidean distance.



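Clearly the achievability of the MLD coding rate region given by (30)-(33) implies that the following rate
region is achievable for the MD problem by using the separation scheme illustrated in Fig. 3:

\[ R_i \ge \frac{1}{2} \log \frac{1}{D_1}, \quad i = 1, 2, 3, \tag{34} \]
\[ R_i + R_j \ge \frac{1}{2} \log \frac{1}{D_1 D_2}, \quad i \ne j, \; i, j \in \{1, 2, 3\}, \tag{35} \]
\[ 2R_i + R_j + R_k \ge \frac{1}{2} \log \frac{1}{D_1^2 D_2 D_3}, \quad (i, j, k) \text{ is a permutation of } (1, 2, 3), \tag{36} \]
\[ R_1 + R_2 + R_3 \ge \frac{1}{4} \log \frac{1}{D_1^3 D_2 D_3^2}. \tag{37} \]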



3.2.2 A simple outer bound 

To derive an outer bound to match the template induced by the SR-MLD coding rate region, we need to
consider bounding the rate combinations $R_i + R_j$ and $2R_i + R_j + R_k$, in addition to the sum rate
$R_1 + R_2 + R_3$. Clearly, the first two kinds of combination can be treated similarly as the sum rate, and we
next show the last kind of rate combination can be appropriately bounded. We start with the following chain
of inequalities,

\[ n(2R_i + R_j + R_k) \ge 2H(S_i) + H(S_j) + H(S_k) \]
\[ \quad \ge 2H(S_i) + H(S_j) + H(S_k) - H(S_iS_j) - H(S_iS_k) \]
\[ \qquad + H(S_iS_j) + H(S_iS_k) - H(S_iS_jS_k) + H(S_iS_jS_k) \]
\[ \qquad - \left[H(S_i|Y_1^n) + H(S_j|Y_1^n) - H(S_iS_j|Y_1^n)\right] - \left[H(S_i|Y_1^n) + H(S_k|Y_1^n) - H(S_iS_k|Y_1^n)\right] \]
\[ \qquad - \left[H(S_iS_j|Y_2^n) + H(S_iS_k|Y_2^n) - H(S_iS_jS_k|Y_2^n)\right] - H(S_iS_jS_k|X^n), \tag{38} \]

where the brackets are nonnegative because $I(S_i; S_j|Y_1^n)$, $I(S_i; S_k|Y_1^n)$ and $I(S_iS_j; S_iS_k|Y_2^n)$ are nonnegative.
Through some algebra, we arrive at 



and now Lemma 3.1 can be applied. By taking $d_1 = D_1$ and $d_2 = D_2$ and further removing non-essential
terms as in the sum rate case, an outer bound can be derived; the details are omitted here for brevity.
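As a sketch of the algebra behind this step (reconstructed here from (38) by pairing each unconditional entropy with its conditional counterpart, so the grouping below is an assumption rather than the authors' own display), the chain (38) can be rewritten in terms of mutual information as

```latex
\begin{align*}
n(2R_i + R_j + R_k) \ge{}& 2I(S_i; Y_1^n) + I(S_j; Y_1^n) + I(S_k; Y_1^n) \\
&+ \left[I(S_iS_j; Y_2^n) - I(S_iS_j; Y_1^n)\right]
 + \left[I(S_iS_k; Y_2^n) - I(S_iS_k; Y_1^n)\right] \\
&+ \left[I(S_iS_jS_k; X^n) - I(S_iS_jS_k; Y_2^n)\right].
\end{align*}
```

The four terms on the first line fall under the first statement of Lemma 3.1, and each bracket falls under the second statement, in the same way as for the sum rate in (25).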

3.2.3 Comparing the inner and outer bounds 

With the simple inner and outer bounds, we conclude that the rate-distortion region is sandwiched between
them as illustrated in Fig. 5, where the gaps between the corresponding planes are measured by the Euclidean
distance. Note that the bounds given here are looser than those given in Fig. 2, and in later sections we
will discuss how the tighter bounds are derived. In addition to providing an approximate characterization of
the R-D region, the result further implies that the simple SR-MLD scheme is in fact not very far away from
optimality, since it is within a small constant of the outer bound.

We use this section to illustrate the underlying ideas in the remainder of this paper. The results for the
general $K$-description case given in the later sections are more involved, and we develop the general case
result not only for the SR-MLD scheme, but also for the PPR multilayer scheme, which is not separation-based.
There are several difficulties in doing so: (1) There is no explicit representation of the inner and
outer bounds as in the case for $K = 3$. (2) The PPR multilayer scheme is originally designed only for the
symmetric-rate case, and we need to "inflate" the single rate point to a rate region. (3) To find tighter bounds,
the simple choice for the values of $d_i$ used in this section is not sufficient. We first summarize the main results
for the general $K$-description problem in Section 4; then in Sections 5 and 6 we shall discuss in more detail
how these difficulties are addressed.



4 Main results

In this section, we present several theorems which summarize the main results for the Gaussian MD problem.
The result on approximating the SID-RD function is given first, followed by the rate-distortion region
approximation. More details are given in Sections 5 and 6 and the appendices. Since the treatment of general sources
under the MSE distortion measure is notationally more involved, it is deferred to those sections.

4.1 Approximating the symmetric individual description rate distortion function

Define the following functions

(39)




(40) 



D. 




a — D, 



a 



- 1 



(41) 




(42) 
where $d_1 \ge d_2 \ge \cdots \ge d_{K-1} > 0$, $d_0 = \infty$ and $d_K = 0$; $D_0 = 1$, and we take the convention $\log \frac{\infty}{\infty} = 0$. For
convenience we define

\[ \underline{R}(D) = \sup_{d_1 \ge d_2 \ge \cdots \ge d_{K-1} > 0} \underline{R}(D, d). \tag{43} \]



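Define the following functions

\[ \Phi_\alpha(D) = \frac{\alpha D}{1 - D}, \quad \alpha = 1, 2, \ldots, K. \tag{44} \]

For a given distortion vector $D = (D_1, D_2, \ldots, D_K)$, we shall associate it with an enhanced distortion vector
$D^* = (D_1^*, D_2^*, \ldots, D_K^*)$ using a recursive procedure.

\[ D_1^* = D_1, \]

\[ D_\alpha^* = \begin{cases} \Phi_\alpha^{-1}\left(\Phi_{\alpha-1}(D_{\alpha-1}^*)\right) & \Phi_\alpha(D_\alpha) > \Phi_{\alpha-1}(D_{\alpha-1}^*) \\ D_\alpha & \text{otherwise} \end{cases}, \quad \alpha = 2, 3, \ldots, K. \tag{45} \]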

This enhanced distortion vector is introduced in order to remove certain cases where the given distortion
vectors cannot be satisfied with equality using the coding schemes we consider; moreover, it has the property
that it does not significantly affect the lower bound. More details on the enhanced distortion vector are given
in Section 5.2. We shall also assume $D_1 < 1$ for simplicity at this point, but will discuss the cases when
$D_1 = 1$ shortly.
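The recursive procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions: it takes $\Phi_\alpha(D) = \alpha D/(1-D)$, and in the first branch of (45) it sets $D_\alpha^*$ to the value that makes $\Phi_\alpha(D_\alpha^*) = \Phi_{\alpha-1}(D_{\alpha-1}^*)$, which is the reading consistent with the enhancement and $\Phi$-monotonicity properties discussed in Section 5.2. The function names and the example vector are hypothetical.

```python
def phi(alpha, D):
    """Phi_alpha(D) = alpha * D / (1 - D); requires D < 1."""
    return alpha * D / (1.0 - D)

def phi_inv(alpha, y):
    """Inverse of Phi_alpha: the D in [0, 1) with Phi_alpha(D) = y."""
    return y / (alpha + y)

def enhanced_distortion(D):
    """Return D* for a distortion vector D = (D_1, ..., D_K) with D_1 < 1."""
    D_star = [D[0]]
    for alpha in range(2, len(D) + 1):
        prev = phi(alpha - 1, D_star[-1])  # Phi_{alpha-1}(D*_{alpha-1})
        cur = D[alpha - 1]
        if phi(alpha, cur) > prev:
            # Constraint is too loose for this level: tighten it so that
            # Phi-monotonicity holds with equality.
            cur = phi_inv(alpha, prev)
        D_star.append(cur)
    return D_star

if __name__ == "__main__":
    print(enhanced_distortion([0.5, 0.4, 0.1]))
```

For the example input, the second constraint is loose and is tightened to 1/3, while the first and third entries are left unchanged.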

We are now ready to present the main theorem of this subsection. 

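Theorem 4.1 Let $D^*$ be the enhanced distortion vector of $D$; then the Gaussian SID-RD function under
symmetric distortion constraints satisfies

\[ \overline{R}(D^*) \ge \tilde{R}(D^*) \ge R(D) \ge \underline{R}(D) \ge \underline{R}(D, d), \tag{46} \]

for any $d_1 \ge d_2 \ge \cdots \ge d_{K-1} \ge d_K = 0$ and $d_0 = \infty$. Moreover,

\[ \overline{R}(D^*) - \underline{R}(D) \le \frac{1}{2} \sum_{\alpha=2}^{K} \frac{1}{\alpha - 1} \log \alpha - \frac{1}{2} \sum_{\alpha=2}^{K} \frac{1}{\alpha} \log(\alpha - 1) \triangleq \overline{L}(K) < 1.48, \tag{47} \]

\[ \tilde{R}(D^*) - \underline{R}(D) \le \frac{1}{2} \sum_{\alpha=2}^{K} \left[ \frac{1}{\alpha - 1} - \frac{1}{\alpha} \right] \log \alpha \triangleq \tilde{L}(K) < 0.92. \tag{48} \]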



Remark: In Theorem 4.1, we bound the gaps between the inner and outer bounds by universal constants.
This is not necessary, and we will show in Section 5 that the bounds can in fact be distortion dependent;
however, we relax these bounds to make them universal here. The numerical values are derived using integral
approximation for series, which does not yield the tightest bounds possible. In Table 1 we have included a few
values of these bounds.

An important and interesting special case is when only the last several levels have distortion constraints,
since usually the packet loss probability is not exceedingly high, and for the majority of the time only a
small number of packets can be lost. Though the universal bound in Theorem 4.1 also holds for degenerate
cases where only certain levels of distortion constraints exist, applying the theorem using the general bound
$\underline{R}(D, d)$ can improve the universal constants significantly. In order to do so, the values $(d_1, d_2, \ldots, d_{K-1})$ need
to be chosen carefully.
Table 1: Values of $\overline{L}(K)$ and $\tilde{L}(K)$ for K = 2, 3, ..., 8.

K                  2        3        4        5        6        7        8
\overline{L}(K)    0.5000   0.7296   0.8648   0.9550   1.0200   1.0693   1.1082
\tilde{L}(K)       0.2500   0.3821   0.4654   0.5235   0.5665   0.6000   0.6268


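Corollary 4.1 For the Gaussian source, when only distortion constraints $D_{K-k+1}, D_{K-k+2}, \ldots, D_K$ exist for
$k \in I_K$ (or equivalently $D_1 = D_2 = \cdots = D_{K-k} = 1$), we have

\[ \overline{R}(D^*) - \underline{R}(D) \le \frac{1}{2} \sum_{\alpha=K-k+2}^{K} \frac{1}{\alpha - 1} \log \alpha - \frac{1}{2} \sum_{\alpha=K-k+2}^{K} \frac{1}{\alpha} \log(\alpha - 1), \]

\[ \tilde{R}(D^*) - \underline{R}(D) \le \frac{1}{2} \sum_{\alpha=K-k+2}^{K} \left[ \frac{1}{\alpha - 1} - \frac{1}{\alpha} \right] \log \alpha. \tag{49} \]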



Remark: These bounds are usually significantly tighter than the constants given in Theorem 4.1. It is easily
seen that when $k$ is kept fixed and $K \to \infty$, the gap approaches zero; in fact, in this case even the sum rate
bounds become asymptotically tight. Corollary 4.1 thus implies that the SR-ULP scheme is even closer
to optimum, and the benefit of using more complicated schemes diminishes as the number of descriptions
increases, when we are guaranteed to receive all but a constant number of descriptions.
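The vanishing of the gap for fixed $k$ can be observed numerically by evaluating the truncated sums of Corollary 4.1. This is a small sketch assuming base-2 logarithms; `gap_sr_ulp` and `gap_ppr` are hypothetical names for the two right-hand sides in (49).

```python
import math

def gap_sr_ulp(K, k):
    """First bound in (49): sums run over alpha = K-k+2, ..., K."""
    return 0.5 * sum(math.log2(a) / (a - 1) - math.log2(a - 1) / a
                     for a in range(K - k + 2, K + 1))

def gap_ppr(K, k):
    """Second bound in (49): sums run over alpha = K-k+2, ..., K."""
    return 0.5 * sum((1.0 / (a - 1) - 1.0 / a) * math.log2(a)
                     for a in range(K - k + 2, K + 1))

if __name__ == "__main__":
    k = 2  # only the last two levels carry distortion constraints
    for K in (4, 10, 100, 1000):
        print(K, gap_sr_ulp(K, k), gap_ppr(K, k))
```

With $k = K$ the sums start at $\alpha = 2$ and the functions reduce to the universal constants of Theorem 4.1; with $k = 1$ the sums are empty and the gap is zero.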



4.2 Approximating the rate-distortion region 

We first define two regions, which are in fact two inner bounds to the Gaussian MD rate region. The first
region is based on the SR-MLD scheme illustrated in Fig. 3, and now we define this (achievable) region for
general $K \ge 3$. Let $\overline{\mathcal{R}}(D)$ be the set of non-negative rate vectors $(R_1, R_2, \ldots, R_K)$ such that

\[ R_i \ge \sum_{\alpha=1}^{K} r_i^{\alpha}, \quad 1 \le i \le K, \tag{50} \]

for some $r_i^{\alpha} \ge 0$, $1 \le \alpha \le K$, satisfying

\[ \sum_{i \in G_v} r_i^{|v|} \ge H_{|v|}(D), \quad \forall v \in \Omega_K, \tag{51} \]

where

\[ H_\alpha(D) = \frac{1}{2} \log \frac{D_{\alpha-1}}{D_\alpha}, \quad \alpha = 1, 2, \ldots, K, \tag{52} \]

and $D_0 = 1$. It is clear that, since the Gaussian source is successively refinable, the right hand side of (52)
simply gives the rate for each layer in the optimal successive refinement code; (50) and (51) are simply the
counterparts of (12) and (13).
The second region is based on a generalization of the PPR multilayer scheme, the details of which are
given in Section 6. First let $D^*$ be the enhanced distortion vector of $D$ and define the following quantities
«.iir)^l^o, ^:^j^^ , „ = 2,3,...,A-. (53) 
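Let $\tilde{\mathcal{R}}(D^*)$ be the set of non-negative rate vectors $(R_1, R_2, \ldots, R_K)$ such that

\[ R_i \ge \sum_{\alpha=1}^{K} r_i^{\alpha}, \quad 1 \le i \le K, \tag{54} \]

for some $r_i^{\alpha} \ge 0$, $1 \le \alpha \le K$, satisfying

\[ \sum_{i \in G_v} r_i^{|v|} \ge \tilde{H}_{|v|}(D^*), \quad v \in \Omega_K. \tag{55} \]

The following theorem establishes that both $\overline{\mathcal{R}}(D)$ and $\tilde{\mathcal{R}}(D^*)$ are inner bounds to the Gaussian MD
rate-distortion region.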

Theorem 4.2 Let $D^*$ be the enhanced distortion vector of $D$; then

\[ \overline{\mathcal{R}}(D^*) \subseteq \tilde{\mathcal{R}}(D^*) \subseteq \mathcal{R}(D^*) \subseteq \mathcal{R}(D). \tag{56} \]

For $K > 3$, it is difficult to enumerate the faces of the inner and outer bounds; thus we alternatively seek
to approximately characterize the bounding planes of the rate-distortion region, defined for any $A \in \mathbb{R}_+^K$ and
$A \ne 0$ as the following function

\[ R_A(D) = \min_{R \in \mathcal{R}(D)} A \cdot R. \tag{57} \]



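Define the following function

\[ \underline{R}_A(D, d) = \frac{1}{2} \sum_{\alpha=1}^{K} f_\alpha(A) \log \frac{(1 + d_\alpha)(D_\alpha + d_{\alpha-1})}{(1 + d_{\alpha-1})(D_\alpha + d_\alpha)}, \tag{58} \]

where the function $f_\alpha(A)$ is defined in (16) and $d_1 \ge d_2 \ge \cdots \ge d_{K-1} > 0$, $d_0 = \infty$ and $d_K = 0$. Define
further the following function

\[ \underline{R}_A(D) = \sup_{d_1 \ge d_2 \ge \cdots \ge d_{K-1} > 0} \underline{R}_A(D, d). \tag{59} \]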

The next theorem establishes the upper and lower bounds for the bounding planes of the rate-distortion 
region. Since the rate-distortion region is convex, if the upper and lower bounds for the bounding planes 
coincide, a complete characterization is then available. The upper and lower bounds given in the following 
theorem do not coincide in general, however the gap between them is bounded, yielding an approximate 
characterization of the rate region. 
Table 2: The values of $f_\alpha(A)$ and bounds for $K = 3$.

A                 (1,0,0)   (1,1,0)             (2,1,1)              (1,1,1)
f_1(A)            1         2                   4                    3
f_2(A)            0         1                   2                    1.5
||.|| by (61)     1/2       (2+log 3)/(2√2)     (3+2 log 3)/(2√6)    (4+3 log 3)/(4√3)
||.|| by (62)     1/2       (1+log 3)/(2√2)     (2+log 3)/(2√6)      (3+log 3)/(4√3)



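Theorem 4.3 For the Gaussian source and any $A \ge 0$,

\[ \sum_{\alpha=1}^{K} f_\alpha(A) H_\alpha(D^*) \ge \sum_{\alpha=1}^{K} f_\alpha(A) \tilde{H}_\alpha(D^*) \ge R_A(D) \ge \underline{R}_A(D) \ge \underline{R}_A(D, d) \tag{60} \]

for any $d_1 \ge d_2 \ge \cdots \ge d_{K-1} \ge 0$, $d_0 = \infty$ and $d_K = 0$. Moreover, for any $A \in \mathbb{R}_+^K$ and $A \ne 0$,

\[ \sum_{\alpha=1}^{K} f_\alpha(A) H_\alpha(D^*) - \underline{R}_A(D) \le \frac{1}{2} \sum_{\alpha=2}^{K} f_{\alpha-1}(A) \log \alpha - \frac{1}{2} \sum_{\alpha=2}^{K} f_\alpha(A) \log(\alpha - 1) \]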

A ^ A A ■ 
< 2^— jrloga — 2^-log(a-l) — log(ir-l), (61) 

a=2 a=2 

and 

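\[ \sum_{\alpha=1}^{K} f_\alpha(A) \tilde{H}_\alpha(D^*) - \underline{R}_A(D) \le \frac{1}{2} \sum_{\alpha=2}^{K} \left[ f_{\alpha-1}(A) - f_\alpha(A) \right] \log \alpha \]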

< ^ - -] l°g« + 1(4^ - (62) 

Z ^ — ' a — i a Z K — I 

a=2 

Remark: It is not immediately clear that the outer bound, which is specified in terms of an uncountable
number of bounding planes indexed by $A$, is still a polytope as for the case $K = 3$. Nevertheless it can indeed
be shown that when we specialize these bounds for an appropriate choice of $d$, it is an equivalent characterization
of a polytope. Moreover, the bounds given in (61) and (62) are established using the bound induced by this
specific choice of $d$. We shall return to this point with more details in Section 6.

Remark: Theorem 4.3, which provides approximate characterizations of the rate-distortion region, is given
in a similar manner as Theorem 4.1, which provides approximate characterizations of the SID-RD function.
The second bound in (61) and the second bound in (62) are more explicit, whereas the first bounds involve the
function $f_\alpha(A)$, which requires solving an optimization problem. These bounds imply that the gaps between
the bounding planes of the inner and outer bounds are upper-bounded by constants independent of the distortion
constraints.

Remark: Whether the polytopic inner bound is a good approximate characterization of the rate region 
does not depend on whether the outer bound is a polytope, but only on how large the gap is between the inner 
and outer bounds. Though for the Gaussian source, the outer bound can be specialized to be a polytope, for 
general sources this does not necessarily hold. Nevertheless, even for general sources, the inner bound, which 
is an approximate characterization of the rate region, is still a polytope. 
Example for $K = 3$: Now we apply the result in Theorem 4.3 to the case of $K = 3$. As illustrated in
Section 3, it suffices to consider the choices of vector $A$ in the following set

\[ \{(1, 0, 0), (1, 1, 0), (2, 1, 1), (1, 1, 1)\}. \tag{63} \]

In Table 2 we list the values of $f_\alpha(A)$, which can be easily verified since the $\alpha$-resolution formulation is a
linear optimization problem. Using (61) and (62), it is straightforward to compute the gaps between the
inner and outer bounds, as shown in the last two rows of Table 2. Note that here the distance is normalized in
terms of Euclidean distance. This improves the result given in Section 3, which was illustrated in Fig. 2.
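The normalized gaps can be recomputed from the first bounds in (61) and (62). The sketch below is illustrative only: the values of $f_1$ and $f_2$ are those listed in Table 2, while the values of $f_3$ are an assumption read off from the coefficients of $H_3$ in (30)-(33); logarithms are base 2 and the names are hypothetical.

```python
import math

# f_alpha(A) for K = 3; the third entry of each tuple is assumed from (30)-(33).
F = {
    (1, 0, 0): (1.0, 0.0, 0.0),
    (1, 1, 0): (2.0, 1.0, 0.0),
    (2, 1, 1): (4.0, 2.0, 1.0),
    (1, 1, 1): (3.0, 1.5, 1.0),
}

def gap_61(f):
    """(1/2) * sum_{a=2}^{K} [ f_{a-1} log2(a) - f_a log2(a-1) ]."""
    K = len(f)
    return 0.5 * sum(f[a - 2] * math.log2(a) - f[a - 1] * math.log2(a - 1)
                     for a in range(2, K + 1))

def gap_62(f):
    """(1/2) * sum_{a=2}^{K} (f_{a-1} - f_a) log2(a)."""
    K = len(f)
    return 0.5 * sum((f[a - 2] - f[a - 1]) * math.log2(a) for a in range(2, K + 1))

def normalized(A, gap):
    """Divide a gap along direction A by the Euclidean norm of A."""
    return gap / math.sqrt(sum(x * x for x in A))

if __name__ == "__main__":
    for A, f in F.items():
        print(A, normalized(A, gap_61(f)), normalized(A, gap_62(f)))
```

For the symmetric direction $A = (1, 1, 1)$ the unnormalized gaps are three times the individual-rate constants of Table 1, which offers a cross-check between the two tables.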

5 Sum rate bound and SID-RD function approximation 

In this section, we provide more details on the derivation of results regarding SID-RD function. Some inter- 
mediate results will be given, which may in fact be of interest by themselves when tighter distortion dependent 
bounds are needed. We first introduce more formally two achievable individual description rates, which are 
given in a general form that can also be applied to other sources, then the derivation of the outer bounds is 
discussed. With both the inner and outer bounds, we analyze and bound the gap between them. Finally, we 
extend the results to general sources under the MSE distortion measure. 

5.1 Achievable rate using the SR-ULP scheme 

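The SR-MLD coding scheme reduces to the SR-ULP scheme when the rate is also symmetric, i.e., $R_1 =
R_2 = \cdots = R_K$. For a general source, we have the following theorem.

Theorem 5.1 For any given set of random variables $(Y_1, Y_2, \ldots, Y_K)$ jointly distributed with the source $X$,
such that there exist deterministic functions $g_\alpha : \mathcal{Y}_1 \times \cdots \times \mathcal{Y}_\alpha \to \mathcal{X}$ to satisfy

\[ \mathbb{E}\, d(X, g_\alpha(Y_1, Y_2, \ldots, Y_\alpha)) \le D_\alpha, \quad \alpha = 1, 2, \ldots, K, \tag{64} \]

we have

\[ R(D_1, D_2, \ldots, D_K) \le \sum_{\alpha=1}^{K} \frac{1}{\alpha} I(X; Y_\alpha | Y_1, Y_2, \ldots, Y_{\alpha-1}), \tag{65} \]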

where Yq — 0. 

This theorem is a natural consequence of combining the result on successive refinement [13, 14] and the
property of the MDS codes, and thus the proof is omitted. This theorem is given formally in order to facilitate
the analysis for general sources. In this work, we consider the following natural distribution often seen in the
successive refinement problem

\[ Y_\alpha = X + \sum_{i=\alpha}^{K} N_i, \quad \alpha = 1, 2, \ldots, K, \tag{66} \]

where Ni ~ A/'(0, af) are mutually independent and also independent of X. For convenience, we denote 
Ylif=a-^i values of variance of, i = 1,2, ...,K are chosen such that the distortion constraint 
at each level is satisfied with equality when the reconstruction is the linear minimum mean squared error
(LMMSE) estimator, i.e., they are determined by the set of equations

\[ D_\alpha = \frac{\sum_{i=\alpha}^{K} \sigma_i^2}{1 + \sum_{i=\alpha}^{K} \sigma_i^2}, \quad \alpha = 1, 2, \ldots, K. \tag{67} \]

It is clear that there always exists a unique and valid solution for these variances when the distortion constraints
are given in the natural monotonic order. Through basic algebraic calculation, we arrive at the following
corollary for $\overline{R}(D)$ defined in (40).

Corollary 5.1 For the Gaussian source,

\[ R(D) \le \overline{R}(D). \tag{68} \]
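The variances in (67) and the resulting SR-ULP rate can be computed explicitly. The sketch below is a minimal illustration for a unit-variance Gaussian source with distortions strictly below 1: it inverts (67) to obtain the partial sums of the noise variances, and it evaluates (65) using the Gaussian identity $I(X; Y_\alpha | Y_1, \ldots, Y_{\alpha-1}) = \frac{1}{2} \log \frac{D_{\alpha-1}}{D_\alpha}$ with $D_0 = 1$, which follows from the degraded structure of (66). The function names are hypothetical and logarithms are base 2.

```python
import math

def sr_ulp_variances(D):
    """Solve (67) for (sigma_1^2, ..., sigma_K^2) given non-increasing D, all < 1."""
    K = len(D)
    s = [d / (1.0 - d) for d in D]  # s[a] = sum of sigma_i^2 over i >= a+1 (0-based)
    return [s[a] - s[a + 1] for a in range(K - 1)] + [s[-1]]

def sr_ulp_rate(D):
    """Individual description rate sum_a (1/(2a)) log2(D_{a-1}/D_a), with D_0 = 1."""
    prev, rate = 1.0, 0.0
    for alpha, d in enumerate(D, start=1):
        rate += math.log2(prev / d) / (2 * alpha)
        prev = d
    return rate

if __name__ == "__main__":
    D = (0.5, 0.2, 0.05)  # illustrative distortion vector
    print(sr_ulp_variances(D))
    print(sr_ulp_rate(D))
```

For $K = 3$ the rate collapses to $\frac{1}{12} \log \frac{1}{D_1^3 D_2 D_3^2}$, the same expression that appears in the relaxed lower bound of Section 3.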



5.2 Achievable rate using the PPR multilayer scheme 

In the two-part paper [5] and [6], an achievable symmetric individual rate is given for the symmetric MD 
problem, and the main theorem is quoted below together with a necessary definition. 

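Theorem 5.2 ([6] Theorem 2) For any probability distribution

\[ p(x, \{y_{\alpha,j}, \alpha \in I_{K-1}, j \in I_K\}, y_K) = p(x)\, p(\{y_{\alpha,j}, \alpha \in I_{K-1}, j \in I_K\}, y_K | x), \tag{69} \]

where $p(\{y_{\alpha,j}, \alpha \in I_{K-1}, j \in I_K\}, y_K | x)$ is symmetric over $\mathcal{X} \times \mathcal{Y}^{K(K-1)+1}$, and a set of decoding functions

\[ g_v : \mathcal{Y}^{|v|^2} \to \mathcal{X}, \quad v \in \Omega_K, \; |v| < K, \]
\[ g_v : \mathcal{Y}^{K(K-1)+1} \to \mathcal{X}, \quad |v| = K, \tag{70} \]

such that

\[ \mathbb{E}(d(X, g_v(Y_{\alpha,j}, \alpha \in I_{|v|}, j \in G_v))) \le D_{|v|}, \quad v \in \Omega_K, \; |v| < K, \]
\[ \mathbb{E}(d(X, g_v(\{Y_{\alpha,j}, \alpha \in I_{K-1}, j \in G_v\}, Y_K))) \le D_K, \quad |v| = K, \tag{71} \]

the following symmetric individual description rate is achievable

\[ R = \sum_{\alpha=1}^{K-1} \frac{1}{\alpha} H(Y_{\alpha,j}, j \in I_\alpha | Y_{i,j}, i \in I_{\alpha-1}, j \in I_\alpha) \]
\[ \qquad + \frac{1}{K} H(Y_K | Y_{i,j}, i \in I_{K-1}, j \in I_K) - \frac{1}{K} H(\{Y_{i,j}, i \in I_{K-1}, j \in I_K\}, Y_K | X). \tag{72} \]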

A symmetric distribution is defined in [6] as follows. 

Definition 5.1 A joint distribution $p(\{y_{\alpha,j}, \alpha \in I_{K-1}, j \in I_K\}, y_K | x)$ is called symmetric if for all $1 \le n_i \le K$
where $i \in I_{K-1}$, the following is true: the joint distribution of $Y_K$ and all $(n_1 + n_2 + \cdots + n_{K-1})$ random
variables where any $n_\alpha$ are chosen from the set $\{Y_{\alpha,1}, Y_{\alpha,2}, \ldots, Y_{\alpha,K}\}$, conditioned on $X$, is the same.
Intuitively, the PPR multilayer scheme provides layered information in the descriptions, and the $\alpha$-th layer
information can only be decoded when at least $\alpha$ descriptions are available. The encoding auxiliary random
variable $Y_{\alpha,j}$ is essentially the information provided in the $j$-th description for the $\alpha$-th layer. In [5] and [6], a
clever scheme of organizing the information is given, resulting in the achievable rate given in Theorem 5.2.

We notice that Definition 5.1 is however unnecessarily restrictive and can be straightforwardly
relaxed. The following alternative definition of a symmetric distribution can replace the more
restrictive one. This relaxed version of symmetric distribution will be useful since our choice of distribution,
which provides simplification in computing the inner bound, is in this relaxed set, but not in the original more
restrictive set.

Definition 5.2 A joint distribution $p(\{y_{\alpha,j}, \alpha \in I_{K-1}, j \in I_K\}, y_K | x)$ is called generalized symmetric if for
any permutation $\pi(\cdot) : I_K \to I_K$, the joint distribution $p(\{y_{\alpha,\pi(j)}, \alpha \in I_{K-1}, j \in I_K\}, y_K | x)$ is the same as
$p(\{y_{\alpha,j}, \alpha \in I_{K-1}, j \in I_K\}, y_K | x)$.

It is straightforward to check that Theorem 5.3 holds true, when we replace the requirement of symmetric
distribution with the generalized version. The original version of symmetric distribution essentially requires
the distribution to be invariant under $K - 1$ different permutations $\pi_\alpha(\cdot)$, one for each layer $\alpha = 1, 2, \ldots, K - 1$;
i.e., if we permute $\{Y_{1,1}, Y_{1,2}, \ldots, Y_{1,K}\}$, and then permute $\{Y_{2,1}, Y_{2,2}, \ldots, Y_{2,K}\}$ differently, and so on for each
$\alpha = 1, 2, \ldots, K - 1$, the resulting distribution should remain the same as the one before such permutations.
This requirement was however never completely utilized in the coding scheme. Instead the coding scheme in
fact only requires invariance under a single permutation $\pi(\cdot)$ which is applied to all the levels simultaneously,
i.e., $\pi_\alpha(\cdot) = \pi(\cdot)$, for $\alpha = 1, 2, \ldots, K - 1$. More formally, we state this generalized result as a theorem.

Theorem 5.3 The statement of Theorem 5.2 holds when the symmetric distribution requirement is replaced
with the generalized symmetric distributions.

From Theorem 5.3, an achievable individual description rate can be derived by choosing a specific set
of encoding auxiliary random variables; more specifically, we shall choose the following set of random
variables. Let

\[ Y_{\alpha,k} = X + \sum_{i=\alpha}^{K-1} N_{i,k}, \quad \alpha = 1, 2, \ldots, K-1, \quad k = 1, 2, \ldots, K, \tag{73} \]

where $N_{i,k}$ are mutually independent zero-mean Gaussian random variables, which are also independent of
$X$. Their variances are denoted as $\sigma_{i,k}^2$, and they satisfy $\sigma_{i,k}^2 = \sigma_{i,k'}^2$ for any $k, k' \in I_K$; we thus denote $\sigma_{i,k}^2$ as
$\sigma_i^2$. For convenience, we shall denote $\sum_{i=\alpha}^{K-1} N_{i,k}$ as $Z_{\alpha,k}$. For the last layer, i.e., $\alpha = K$, we use

\[ Y_K = X - \mathbb{E}(X | Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K) + N_K, \tag{74} \]

where $N_K$ is a zero-mean Gaussian random variable independent of everything else, with variance $\sigma_K^2$. Clearly,
$X - \mathbb{E}(X | Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K)$ is the innovation of $X$ given all the lower-level random variables. It remains
to specify the variances of $\{\{N_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K\}, N_K\}$, which is in fact not trivial as we shall discuss
next. Notice that for all the layers except the $\alpha = K$-th layer, $Y_{\alpha-1,j} \leftrightarrow Y_{\alpha,j} \leftrightarrow X$ is a Markov string; thus the
lower layers are useless when higher layers are decoded. To see that this choice of encoding auxiliary random
variables does not satisfy the original symmetric distribution requirement, consider the joint distribution of
$(Y_{1,1}, Y_{2,1})$ and that of $(Y_{1,1}, Y_{2,2})$. Given $X$, the first pair of random variables are dependent, while the second
pair of random variables are independent; this clearly violates the original symmetric distribution requirement.
The key difficulty we face is now the following: when descriptions in the set $G_v$, where $|v| = \alpha - 1 <
K - 1$, are received, the decoding function can reconstruct the source using the random variables $\{Y_{\alpha-1,i}, i \in
G_v\}$; note that from (73), it is clear that since $Y_{\alpha-2,k} = Y_{\alpha-1,k} + N_{\alpha-2,k}$, using only $\{Y_{\alpha-1,i}, i \in G_v\}$ to
reconstruct the source does not lose optimality, i.e., the lower layer random variables are useless given the
higher layers. If one more description, say the $j$-th description, is further received, the decoding function
now can utilize the random variable associated with this description, $Y_{\alpha-1,j}$. Thus even if the $\alpha$-th layer
random variables $\{Y_{\alpha,i}, i \in G_v \cup \{j\}\}$ do not provide additional information beyond the lower layer random
variables $\{Y_{\alpha-1,i}, i \in G_v \cup \{j\}\}$, the decoder is still able to improve the reconstruction over the original
decoding function with descriptions in $G_v$. This is in fact a key observation in [5] that improves the system
performance over the simple SR-ULP scheme. This observation implies that for certain distortion vectors $D$,
it is not possible to satisfy all the constraints with equalities with the PPR multilayer scheme because some
constraints are too loose, and thus the distortion region has some degenerate regimes. The enhanced distortion
vector given in Section 4 is thus introduced to eliminate this effect. This enhanced distortion vector $D^*$
serves a similar role as the enhanced channel in [19], where the MIMO Gaussian broadcast channel capacity
is established.

The enhanced distortion vector D* has the following three important properties: 

• Enhancement: $D^*$ enhances the distortion vector $D$, i.e., $D_i^* \le D_i$, $i = 1, 2, \ldots, K$.

• Monotonicity: $D^* = (D_1^*, D_2^*, \ldots, D_K^*)$ is a monotonically decreasing sequence, thus a valid distortion
vector.

• $\Phi$-monotonicity: it satisfies the condition

\[ \Phi_\alpha(D_\alpha^*) \le \Phi_{\alpha-1}(D_{\alpha-1}^*), \quad \alpha = 2, 3, \ldots, K. \tag{75} \]

These properties are straightforward to check by the construction of $D^*$.

The $\Phi$-monotonicity property is exactly the condition being checked in the definition of the enhanced
distortion vector, with $D_\alpha^*$ replacing $D_\alpha$. Thus the definition of the enhanced distortion vector effectively
constructs a new distortion vector in a sequential manner, if the original distortion vector does not satisfy
the $\Phi$-monotonicity property. The desired $\Phi$-monotonicity property removes the degenerate regimes and the
corresponding difficulty previously discussed. To see this, consider the following two cases: (1) when
descriptions in $G_u$ are received, where $|u| = \alpha$; (2) when descriptions in $G_v$ are received, where $|v| = \alpha - 1$
and $G_v \subset G_u$. For the latter, using the given Gaussian auxiliary random variables $\{Y_{\alpha-1,i}, i \in G_v\}$, linear
estimation induces a distortion

\[ D'_{\alpha-1} = \frac{\sum_{i=\alpha-1}^{K-1} \sigma_i^2}{(\alpha - 1) + \sum_{i=\alpha-1}^{K-1} \sigma_i^2}. \tag{76} \]

Similarly, using the random variables $\{Y_{\alpha,i}, i \in G_u\}$, linear estimation induces

\[ D'_\alpha = \frac{\sum_{i=\alpha}^{K-1} \sigma_i^2}{\alpha + \sum_{i=\alpha}^{K-1} \sigma_i^2}. \tag{77} \]

In the case that each individual encoding auxiliary random variable $Y_{\alpha,j}$ does not refine over $Y_{\alpha-1,j}$, i.e., there is
no explicit information embedded in the $\alpha$-th layer, we have $\sigma_{\alpha-1}^2 = 0$, i.e., $D'_{\alpha-1}$ is given by

\[ D'_{\alpha-1} = \frac{\sum_{i=\alpha}^{K-1} \sigma_i^2}{(\alpha - 1) + \sum_{i=\alpha}^{K-1} \sigma_i^2}. \tag{78} \]

Now suppose the distortion constraint at the $(\alpha - 1)$-th level is given by $D_{\alpha-1} = D'_{\alpha-1}$ as in (78); then the
degenerate case previously discussed indeed occurs if the distortion constraint at the $\alpha$-th level is given such
that $D_\alpha > D'_\alpha$. Through elementary algebra, it is clear that this is equivalent to the condition

\[ \frac{\alpha D_\alpha}{1 - D_\alpha} > \frac{(\alpha - 1) D_{\alpha-1}}{1 - D_{\alpha-1}}, \tag{79} \]

which is exactly the negation of (75), with $(D_{\alpha-1}, D_\alpha)$ replacing $(D_{\alpha-1}^*, D_\alpha^*)$.

Thus if the condition (75) does not hold for the given distortion constraints $D_{\alpha-1}$ and $D_\alpha$, our choice of
Gaussian encoding auxiliary random variables will not be able to achieve the given $(D_{\alpha-1}, D_\alpha)$ simultaneously
with equality, but can naturally achieve strictly better distortions with equality. For the enhanced distortion
vector $(D_1^*, D_2^*, \ldots, D_K^*)$, which indeed satisfies the condition (75), the distortion constraints can always be
satisfied with equality in this achievability scheme, by choosing the appropriate variances $(\sigma_1^2, \sigma_2^2, \ldots, \sigma_K^2)$.
Conversely, given an enhanced distortion vector $D^*$, the variances of the auxiliary random variables
$\{\{N_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K\}, N_K\}$ are uniquely determined. More precisely, the variances $\sigma_\alpha^2$, $\alpha = 1, 2, \ldots, K - 1$, are
determined by

\[ \sum_{i=\alpha}^{K-1} \sigma_i^2 = \Phi_\alpha(D_\alpha^*), \tag{80} \]

which always gives a set of valid choices of the variances. Thus from here on, in the PPR multilayer coding
scheme, the Gaussian auxiliary random variables will be assumed to have the variances thus determined.
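Solving (80) for the individual variances amounts to differencing consecutive values of $\Phi_\alpha(D_\alpha^*)$. The sketch below is a minimal illustration under the assumptions $\Phi_\alpha(D) = \alpha D/(1-D)$ and $K \ge 2$; it covers only the first $K - 1$ layers, since (80) does not determine $\sigma_K^2$, and the function names are hypothetical.

```python
def phi(alpha, D):
    """Phi_alpha(D) = alpha * D / (1 - D); requires D < 1."""
    return alpha * D / (1.0 - D)

def ppr_variances(D_star):
    """Return (sigma_1^2, ..., sigma_{K-1}^2) solving (80) for an enhanced vector D*."""
    K = len(D_star)
    # s[a-1] = sum_{i=a}^{K-1} sigma_i^2 = Phi_a(D*_a) for a = 1, ..., K-1
    s = [phi(a, D_star[a - 1]) for a in range(1, K)]
    return [s[a] - s[a + 1] for a in range(K - 2)] + [s[-1]]

def layer_distortion(alpha, s_alpha):
    """LMMSE distortion from alpha descriptions whose noise partial sum is s_alpha."""
    return s_alpha / (alpha + s_alpha)

if __name__ == "__main__":
    D_star = [0.5, 0.3, 0.1]  # illustrative vector satisfying Phi-monotonicity
    print(ppr_variances(D_star))
```

The differences are nonnegative exactly when the $\Phi$-monotonicity condition (75) holds, which is why the enhanced vector always yields valid variances.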

With the enhanced distortion vector $D^*$ properly defined, we have the following corollary, the proof of
which is given in Appendix 9.

Corollary 5.2 For the Gaussian source, 

R{D) < R{D*) < R{D*). (81) 
The first inequality is clearly true because D* enhances D. 

5.3 Lower bounding the sum rate 

Next we generalize the lower bounding derivation given in Section 3 for $K = 3$ to the case of general $K$. The
generalization is notationally involved, and the result is summarized in the following theorem.

Theorem 5.4 For the Gaussian source, the sum rate under the $K$-description symmetric distortion satisfies

\[ \sum_{i=1}^{K} R_i \ge \sum_{\alpha=1}^{K} \frac{K}{2\alpha} \log \frac{(1 + d_\alpha)(D_\alpha + d_{\alpha-1})}{(1 + d_{\alpha-1})(D_\alpha + d_\alpha)}, \tag{82} \]

where $d_1 \ge d_2 \ge \cdots \ge d_{K-1} > 0$ are arbitrary non-negative values, $d_0 = \infty$ and $d_K = 0$.

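Proof 2 The bounding technique extends the method used in [2, 8, 9], however with the new ingredient that
we expand the probability space with more than one additional random variable, and then utilize the special
structure in the expanded probability space to bound the sum rate. We have the following chain of inequalities

\[ n \sum_{i=1}^{K} (R_i + \epsilon) \ge \sum_{i=1}^{K} H(S_i) - H(S_i, i \in I_K | X^n) \]
\[ \overset{(a)}{=} \sum_{\alpha=1}^{K-1} \left[ \frac{K}{\alpha \binom{K}{\alpha}} \sum_{G_v : |v| = \alpha} H(S_i, i \in G_v) - \frac{K}{(\alpha+1) \binom{K}{\alpha+1}} \sum_{G_v : |v| = \alpha+1} H(S_i, i \in G_v) \right] \]
\[ \qquad + H(S_i, i \in I_K) - H(S_i, i \in I_K | X^n) \]
\[ \overset{(b)}{\ge} I(S_i, i \in I_K; X^n) + \sum_{\alpha=1}^{K-1} \left[ \frac{K}{\alpha \binom{K}{\alpha}} \sum_{G_v : |v| = \alpha} H(S_i, i \in G_v) - \frac{K}{(\alpha+1) \binom{K}{\alpha+1}} \sum_{G_v : |v| = \alpha+1} H(S_i, i \in G_v) \right] \]
\[ \qquad - \sum_{\alpha=1}^{K-1} \left[ \frac{K}{\alpha \binom{K}{\alpha}} \sum_{G_v : |v| = \alpha} H(S_i, i \in G_v | Y_\alpha^n) - \frac{K}{(\alpha+1) \binom{K}{\alpha+1}} \sum_{G_v : |v| = \alpha+1} H(S_i, i \in G_v | Y_\alpha^n) \right] \]
\[ = I(S_i, i \in I_K; X^n) + \sum_{\alpha=1}^{K-1} \left[ \frac{K}{\alpha \binom{K}{\alpha}} \sum_{G_v : |v| = \alpha} I(S_i, i \in G_v; Y_\alpha^n) - \frac{K}{(\alpha+1) \binom{K}{\alpha+1}} \sum_{G_v : |v| = \alpha+1} I(S_i, i \in G_v; Y_\alpha^n) \right] \]
\[ = \sum_{i=1}^{K} I(S_i; Y_1^n) + \sum_{\alpha=2}^{K-1} \frac{K}{\alpha \binom{K}{\alpha}} \sum_{G_v : |v| = \alpha} \left[ I(S_i, i \in G_v; Y_\alpha^n) - I(S_i, i \in G_v; Y_{\alpha-1}^n) \right] \]
\[ \qquad + \left[ I(S_i, i \in I_K; X^n) - I(S_i, i \in I_K; Y_{K-1}^n) \right], \tag{83} \]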

where (a) is by adding and subtracting the same terms where the positive term in the bracket chases the
negative one; (b) is true because the subtracted bracket is nonnegative due to the conditional version of Han's
inequality [22]; $\{Y_\alpha, \alpha \in I_K\}$ are defined in (66), though here we are not using them to construct codes. For
convenience we denote $d_\alpha = \sum_{j=\alpha}^{K} \sigma_j^2$. We apply Lemma 3.1 on (83) to get the desired result by
noticing

\[ \log \frac{D_1 + d_0}{1 + d_0} = \log \frac{D_1 + \infty}{1 + \infty} = 0, \tag{84} \]

with the convention $\log \frac{\infty}{\infty} = 0$.

Note that the lower bound in Theorem 5.4 is in fact a set of lower bounds, parametrized by $d_1 \ge d_2 \ge
\cdots \ge d_{K-1} > 0$. We may optimize it to find the tightest lower bound; however, an explicit optimization is not
only difficult, but also fails to offer much insight due to the lack of a matching achievability result. Instead we
shall choose a specific set of values to get a (sub-optimal) bound, resulting in the following corollary.
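To make the parametrized family of bounds concrete, the following sketch evaluates the right-hand side of (82) for a given $d$ and runs a coarse search over a small grid. It is an illustration only: the grid, the example distortions, and the function names are assumptions, logarithms are base 2, and the search is not the optimization referred to in the text, just a brute-force scan over non-increasing tuples.

```python
import itertools
import math

def sum_rate_lower_bound(D, d):
    """Right-hand side of (82) in bits, with d_0 = infinity and d_K = 0 appended."""
    K = len(D)
    if len(d) != K - 1:
        raise ValueError("d must have K - 1 entries")
    d_full = [math.inf] + list(d) + [0.0]
    total = 0.0
    for alpha in range(1, K + 1):
        hi, lo, D_a = d_full[alpha - 1], d_full[alpha], D[alpha - 1]
        if math.isinf(hi):
            # Convention log(inf/inf) = 0: the d_0 factors cancel.
            ratio = (1 + lo) / (D_a + lo)
        else:
            ratio = (1 + lo) * (D_a + hi) / ((1 + hi) * (D_a + lo))
        total += K / (2 * alpha) * math.log2(ratio)
    return total

def coarse_search(D, grid):
    """Best bound over non-increasing tuples d drawn from a finite positive grid."""
    K = len(D)
    values = sorted(grid, reverse=True)
    return max(sum_rate_lower_bound(D, d)
               for d in itertools.combinations_with_replacement(values, K - 1))

if __name__ == "__main__":
    D = (0.5, 0.2, 0.05)
    print(sum_rate_lower_bound(D, (D[0], D[1])))          # simple choice d_a = D_a
    print(coarse_search(D, [0.05, 0.1, 0.2, 0.5, 1.0]))   # coarse grid scan
```

Because the grid contains the simple choice $d_\alpha = D_\alpha$, the scanned value can only match or improve on it.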
Corollary 5.3 For the Gaussian source, the SID-RD function under symmetric distortion constraints $D$
satisfies

^ K ^ n* ^ ^ ^ 1^1 
> iyiiog%i_iy^loga + -y-log(a-l), (85) 

a=l " 0=2 a=2 

where $D^*$ is the enhanced distortion vector of $D$.

This corollary is proved in Appendix 10. It is worth noting that the left hand side of (85) concerns
the SID-RD function of distortion vector $D$, and the right hand sides of (85) are only related to the enhanced
distortion vector. Indeed the enhanced distortion vector is given in such a way that it does not change the
lower bound under the chosen value of $(d_1, d_2, \ldots, d_{K-1})$.



5.4 Bounding the gap between lower and upper bounds 

Now it is rather straightforward to prove Theorem 4.1. Since $D^*$ enhances $D$, we have by Corollary 5.1 and
Corollary 5.2 that

\[ R(D) \le R(D^*) \le \tilde{R}(D^*) \le \overline{R}(D^*). \tag{86} \]

Now combining (86) with Theorem 5.4 and Corollary 5.3 gives Theorem 4.1.

Theorem 4.1 provides one possible approximation for the SID-RD function with a universal constant bit
bound. Various improvements can be made, for example, a better choice of $(d_1, d_2, \ldots, d_{K-1})$ and a better choice
of random variables in the PPR multilayer coding scheme. Moreover, when proving Corollary 5.3 we have
omitted many terms, which may make the bound looser. In fact, for the case with only two-level distortion
constraints, the outer bound in Theorem 5.4 reduces correctly to the one given in [8] and [9]. It was shown in
[2], [8] and [9] that for certain cases this bound is indeed tight, which however requires optimization to find
the optimal bound. We will not pursue such refinements here, but leave them to interested readers.

In order to prove Corollary 4.1, notice that this case implies we can choose $D_1 = D_2 = \cdots = D_{K-k} = 1$,
and furthermore we can set $d_{K-k} = \infty$. Thus the lower bound $\underline{R}(D, d)$ implies that

\[ R(D) \ge \frac{1}{2} \sum_{\alpha=K-k+1}^{K} \frac{1}{\alpha} \log \frac{(1 + d_\alpha)(D_\alpha + d_{\alpha-1})}{(1 + d_{\alpha-1})(D_\alpha + d_\alpha)}. \tag{87} \]

Apply the procedure of computing the enhanced distortion vector on $(D_{K-k+1}, D_{K-k+2}, \ldots, D_K)$ only, and
denote the output as $(D_{K-k+1}^*, D_{K-k+2}^*, \ldots, D_K^*)$. We then follow the proof of Corollary 5.3, and arrive at

1^1, D* 1 ^ 1 , , _ . 1 ^ 1 



a=K-k+l " a=K-k+2 a=K-k+2 

>l E E ^loga + i X: -M«-l). (88) 

2 tr^ a D* 2 f-^ a — 1 2 f-^ a 



a=K-k+l " a=K-k+2 a=K-k+2 



It is clear that $H_\alpha = 0$ for $\alpha = 1, 2, \ldots, K - k$. Thus we have proved the bound for the differences between
the upper and lower bounds as given in Corollary 4.1.






5.5 Extension to general sources 



In this subsection we generalize the result for the Gaussian source to other sources under the MSE distortion
measure, and show that similar but looser bounds hold for the symmetric individual description rate under the
quadratic distortion measure. We derive the result using the SR-ULP scheme, but not the PPR multilayer
scheme, which appears difficult to analyze for general sources. Interestingly, for $K = 2$ and the symmetric
distortion constraints, the sum rate gap between the upper bound derived using the SR-ULP scheme and
the R-D function is upper-bounded by 1.5 bits, which is the same value as that derived in [23] for the two
description case; nevertheless our result is stronger, since in [23] the achievable scheme is more involved
than the SR-ULP scheme yet the bounding constant is the same.

Some additional definitions are necessary. For a general source X with finite differential entropy, zero 
mean and unit variance, define the following quantity, 

$$R'(D) = \sum_{\alpha=1}^{K} \frac{1}{\alpha} I(X; Y_\alpha | Y_1, Y_2, \ldots, Y_{\alpha-1}) \qquad (89)$$

where the random variables $Y_\alpha$, $\alpha = 1, 2, \ldots, K$, are defined as in (66) and (67).
The following theorem is the main result of this section. 

Theorem 5.5 For any general source X with unit variance under the MSE distortion measure, we have 

$$R'(D) \ge R(D), \qquad (90)$$

moreover,

$$R'(D) - R(D) \le \sum_{\alpha=1}^{K} \frac{1}{2\alpha}. \qquad (91)$$



This theorem essentially states that the SR-ULP scheme with the additive Gaussian codebook operates within $\sum_{\alpha=1}^{K} (2\alpha)^{-1}$ bits of the optimal coding scheme, in terms of individual description rate, for any source with unit variance. The first statement in the theorem is trivial by applying Theorem 5.1, and the second statement is proved in Appendix 11.
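
To get a feel for the size of this constant, the bound $\sum_{\alpha=1}^{K} 1/(2\alpha)$ on the individual description rate gap can be evaluated for small $K$. The following is a minimal sketch; the function names are hypothetical, and the sum-rate figure is taken to be $K$ times the individual-rate figure, an assumption that is consistent with the 1.5-bit value quoted above for $K = 2$.

```python
from fractions import Fraction


def individual_rate_gap(K: int) -> Fraction:
    """Bound sum_{alpha=1}^{K} 1/(2*alpha) on the individual-rate gap, in bits."""
    if K < 1:
        raise ValueError("K must be a positive integer")
    return sum(Fraction(1, 2 * alpha) for alpha in range(1, K + 1))


def sum_rate_gap(K: int) -> Fraction:
    """Assumed sum-rate counterpart: K descriptions, each with the same gap."""
    return K * individual_rate_gap(K)


if __name__ == "__main__":
    for K in (2, 3, 4, 8):
        print(K, float(individual_rate_gap(K)), float(sum_rate_gap(K)))
```

The bound is half of a harmonic partial sum, so it grows only logarithmically with the number of descriptions.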

Unlike Theorem 4.1, there is no explicit lower bound on the SID-RD function. Indeed, in the proof of Theorem 5.5, the outer bound is never explicitly written in a single-letter form or an analytical form that can be computed directly. The key proof idea is to construct the lower and upper bounds in appropriate forms such that certain terms are the same, and then cancel these terms to bound the remaining terms.



6 Rate-distortion region approximation 

In this section, we develop the results further to provide an approximate characterization for the MD rate-distortion region. The main difficulties are as follows. Firstly, the PPR multilayer scheme was originally designed for the symmetric rate only rather than for an achievable rate region, and thus a certain generalization has to be introduced to "inflate" it to a rate region. We apply the α-resolution method to assert the achievability of the corner points of a region which matches the polytopic template of the MLD rate region, and therefore by a time-sharing argument provide an achievable region. The second difficulty is in generalizing the sum-rate lower bounding technique to other rate combinations. The terms in deriving the lower bound are well-structured for the sum rate case; however, for general rate combinations the terms lack such structure. Unlike the case $K = 3$, there is no explicit method to enumerate the appropriate rate combinations, i.e., the bounding faces of the rate regions. To overcome this difficulty, we combine the α-resolution method with the sum rate lower bounding technique to provide the outer bound, or rather the lower bound for the bounding planes of the rate region.

6.1 Achievable rate-distortion region by the SR-MLD scheme 

Parallel to the SID-RD case, we give a general definition of the rate region, not necessarily using a Gaussian codebook, which is based on the SR-MLD coding scheme illustrated in Fig. 3. Let $\mathcal{R}'(Y)$ be the set of non-negative rate vectors $(R_1, R_2, \ldots, R_K)$, such that

$$R_i \ge \sum_{\alpha=1}^{K} r_i^{\alpha}, \quad 1 \le i \le K, \qquad (92)$$

for some $r_i^{\alpha} \ge 0$, $1 \le \alpha \le K$, satisfying

$$\sum_{i \in G_v} r_i^{|v|} \ge I(X; Y_{|v|} | Y_1, Y_2, \ldots, Y_{|v|-1}), \quad v \in \Omega_K. \qquad (93)$$

We have slightly abused the notation in the above definitions by letting the argument of $\mathcal{R}'(\cdot)$ be a fixed set of random variables rather than a set of distortion constraints; this however does not cause much confusion due to their apparent difference.
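
The structure of this definition can be made concrete with a small check. The sketch below assumes the template reads as follows: each description rate is split into per-layer parts $r_i^\alpha \ge 0$, and for every subset of $\alpha$ descriptions the layer-$\alpha$ parts must add up to at least a layer requirement. The layer requirements are passed in as plain numbers (the values used are illustrative only, not taken from the paper), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def satisfies_layer_constraints(r, H):
    """Check the per-layer subset constraints for a candidate rate split.

    r[a][i] is the rate of description i+1 spent on layer a+1;
    H[a] is the requirement of layer a+1 (a given number here).
    """
    K = len(H)
    for a in range(K):
        if any(x < 0 for x in r[a]):
            return False
        # every subset of size a+1 must cover the layer-(a+1) requirement
        for subset in combinations(range(K), a + 1):
            if sum(r[a][i] for i in subset) < H[a]:
                return False
    return True


def description_rates(r):
    """Total rate of each description: the sum of its per-layer parts."""
    K = len(r)
    return [sum(r[a][i] for a in range(K)) for i in range(K)]


if __name__ == "__main__":
    H = [Fraction(1), Fraction(1, 2), Fraction(1, 3)]  # illustrative values
    K = len(H)
    # symmetric split: every description spends H[a]/(a+1) on layer a+1
    symmetric = [[H[a] / (a + 1)] * K for a in range(K)]
    print(satisfies_layer_constraints(symmetric, H))
    print(description_rates(symmetric))
```

With the symmetric split every subset constraint is met with equality, which corresponds to the symmetric individual description rate $\sum_\alpha H_\alpha / \alpha$; asymmetric splits trade rate between descriptions while keeping every subset constraint satisfied.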

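Theorem 6.1 We have

$$\mathrm{conv}(\mathcal{R}'(Y)) \subseteq \mathcal{R}(D), \qquad (94)$$

where $\mathrm{conv}(\cdot)$ is the convex hull operator, and it is taken over the set of auxiliary random variables $Y = (Y_1, Y_2, \ldots, Y_K)$ in some alphabets $\mathcal{Y}_1 \times \mathcal{Y}_2 \times \cdots \times \mathcal{Y}_K$ which are jointly distributed with $X$, such that there exist deterministic functions $g_\alpha : \mathcal{Y}_1 \times \mathcal{Y}_2 \times \cdots \times \mathcal{Y}_\alpha \to \mathcal{X}$ to satisfy

$$E\, d(X, g_\alpha(Y_1, Y_2, \ldots, Y_\alpha)) \le D_\alpha, \quad \alpha = 1, 2, \ldots, K. \qquad (95)$$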

By choosing the auxiliary random variables $Y_\alpha$, $\alpha = 1, 2, \ldots, K$, as specified by (66) and (67), it is clear that $\mathcal{R}'(D)$ is a (proper) subset of $\mathrm{conv}(\mathcal{R}'(Y))$, and thus an achievable region. Note that the region $\mathrm{conv}(\mathcal{R}'(Y))$ may be a general convex region with a curved boundary, and thus not a polytope. However, $\mathcal{R}'(D)$ is a subset of this set obtained by specializing it to a particular distribution, resulting in a polytope (see the footnote below). Interestingly, it is not a contra-polymatroid as often encountered in multiuser information theory. A contra-polymatroid is usually defined through a mapping from subsets (of the rate indices) to non-negative real numbers. However, here even for the three description case, there are four constraints associated with the set of all three rates, one bounding $R_1 + R_2 + R_3$, and the other three of the form $2R_i + R_j + R_k$, which thus does not result in a valid mapping. As such, the theory characterizing the vertex points of a contra-polymatroid does not offer simplification in the MD problem. The α-resolution method invented in [16] is one approach to address this difficulty.

Footnote: It is the projection of the polytope $(R_1, R_2, \ldots, R_K, \{r_i^\alpha, \alpha \in I_K, i \in I_K\})$ on the first $K$ components.

6.2 Achievable rate-distortion region by the PPR multilayer scheme

In this subsection, we first briefly describe the PPR multilayer coding scheme, and then discuss the difficulties in generalizing this achievable symmetric rate result to an achievable rate region. To overcome these difficulties, we combine the α-resolution method with an additional coding step to provide such a rate region.

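The PPR multilayer coding scheme can be described roughly as follows. At layer $\alpha$, $\alpha \in I_{K-1}$, and for any description $k \in I_K$, codebooks of size $2^{nR'_{\alpha,k}}$ are generated using the marginal distribution of $Y_{\alpha,k}$. The rate $R'_{\alpha,k}$ should be sufficiently large such that for any source codeword, with high probability there exist codewords in the codebooks $(\alpha, k)$, $\alpha \in I_{K-1}$ and $k \in I_K$, that are jointly typical with it. This can be done if we choose

$$R'_{\alpha,k} > h(Y_{\alpha,1}) - \frac{1}{K} h(Y_{\alpha,k}, k \in I_K | X, \{Y_{j,k}, j \in I_{\alpha-1}, k \in I_K\}). \qquad (96)$$

Though there is no requirement that $R'_{\alpha,k} = R'_{\alpha,k'}$ for any $k \ne k'$, we intentionally make them equal to simplify the resulting achievable region; i.e., we choose

$$R'_{\alpha,k} = h(Y_{\alpha,1}) - \frac{1}{K} h(Y_{\alpha,k}, k \in I_K | X, \{Y_{j,k}, j \in I_{\alpha-1}, k \in I_K\}) + \delta, \qquad (97)$$

for an arbitrarily small but positive $\delta$. Next, codewords in a codebook are randomly and independently assigned into a total of $2^{nR_{\alpha,k}}$ bins, $\alpha \in I_{K-1}$ and $k \in I_K$. At the decoder, with any $k^*$ descriptions such that $k^* \in I_{K-1}$, the first $k^*$ layers are decoded. More precisely, the decoder receives descriptions in $G_v$, such that $|v| = k^*$; if there exists a unique set of codewords $\{y^n_{\alpha,j}, \alpha \in I_{k^*}, j \in G_v\}$ in the corresponding bins that are jointly typical, then the decoder reconstructs using the single-letter decoding function $g_v(\cdot)$; otherwise a decoding failure occurs. To succeed with high probability for any $k^* \in I_{K-1}$, the rates $R_{\alpha,k}$, $\alpha \in I_{K-1}$ and $k \in I_K$, only need to satisfy

$$0 \le R_{\alpha,j} \le R'_{\alpha,j}, \quad \alpha \in I_{K-1}, \ j \in I_K, \qquad (98)$$

and

$$\sum_{j \in G_v} (R'_{\alpha,j} - R_{\alpha,j}) < \alpha h(Y_{\alpha,1}) - h(Y_{\alpha,i}, i \in I_\alpha | Y_{k,j}, k \in I_{\alpha-1}, j \in I_\alpha), \qquad (99)$$

for all $v \in \Omega_K$ such that $|v| = \alpha$, and for all $\alpha \in I_{K-1}$. Rewriting (99) and substituting (97), we have

$$
\begin{aligned}
\sum_{j \in G_v} R_{\alpha,j} &> \sum_{j \in G_v} R'_{\alpha,j} - \alpha h(Y_{\alpha,1}) + h(Y_{\alpha,i}, i \in I_\alpha | Y_{k,j}, k \in I_{\alpha-1}, j \in I_\alpha) \\
&= \alpha h(Y_{\alpha,1}) - \frac{\alpha}{K} h(Y_{\alpha,k}, k \in I_K | X, \{Y_{j,k}, j \in I_{\alpha-1}, k \in I_K\}) + \alpha\delta \\
&\quad - \alpha h(Y_{\alpha,1}) + h(Y_{\alpha,i}, i \in I_\alpha | Y_{k,j}, k \in I_{\alpha-1}, j \in I_\alpha) \\
&= h(Y_{\alpha,i}, i \in I_\alpha | Y_{k,j}, k \in I_{\alpha-1}, j \in I_\alpha) \\
&\quad - \frac{\alpha}{K} h(Y_{\alpha,k}, k \in I_K | X, \{Y_{j,k}, j \in I_{\alpha-1}, k \in I_K\}) + \alpha\delta. \qquad (100)
\end{aligned}
$$

The last layer codebook is generated using the more conventional method, i.e., the conditional codebook, and the following condition is sufficient

$$\sum_{k=1}^{K} R_{K,k} \ge I(X; Y_K | Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K). \qquad (101)$$
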
By collecting the constraints on the non-negative rates $R_{\alpha,j}$ in (98), (100) and (101), and defining $R_k = \sum_{\alpha=1}^{K} R_{\alpha,k}$, we can already form an achievable region. However, the upper bound in (98) introduces additional difficulty when comparing to the outer bound derived in the next section, and thus it is desirable to remove this condition. In other words, with these constraints taken into consideration, it is not clear whether the resulting region matches the polytopic template of the MLD coding rate region. Next we define a similar region, and prove that this region is indeed achievable and can be written in a form with the same structure as the desired template. In [25], we gave a different scheme using orthogonal binning; however, we believe the scheme given below is more straightforward. We first introduce some more notation.

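For a fixed set of (generalized symmetric) auxiliary random variables $\{\{Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K\}, Y_K\}$, define the following quantities for $\alpha \in I_{K-1}$,

$$H_\alpha(Y) = h(Y_{\alpha,i}, i \in I_\alpha | Y_{k,j}, k \in I_{\alpha-1}, j \in I_\alpha) - \frac{\alpha}{K} h(Y_{\alpha,k}, k \in I_K | X, \{Y_{j,k}, j \in I_{\alpha-1}, k \in I_K\}), \qquad (102)$$

and

$$H_K(Y) = I(X; Y_K | Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K). \qquad (103)$$

Let $\mathcal{R}(Y)$ be the set of non-negative rate vectors $(R_1, R_2, \ldots, R_K)$, such that

$$R_i \ge \sum_{\alpha=1}^{K} r_i^{\alpha}, \quad 1 \le i \le K, \qquad (104)$$

for some $r_i^{\alpha} \ge 0$, $1 \le \alpha \le K$, satisfying

$$\sum_{i \in G_v} r_i^{|v|} \ge H_{|v|}(Y), \quad v \in \Omega_K. \qquad (105)$$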

Note that here the set of auxiliary random variables in fact has more than $K$ components; however, we still write it as $Y$ for conciseness. We also slightly abuse the notation by letting $H_\alpha(\cdot)$ have either the enhanced distortion vector $D^*$ or a set of random variables $Y$ as the argument, which is justified since, as we shall show, they are in fact the same under an appropriate choice of the random variables $Y$. The region $\mathcal{R}(Y)$ is the rate region satisfying (100), (101), and the lower bounds in (98), but not necessarily satisfying the upper bounds in (98). Thus for a fixed set of random variables $\{\{Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K\}, Y_K\}$ and the specific choice of $R'_{\alpha,k}$, the achievable region directly implied by the PPR multilayer scheme, i.e., the one obtained by collecting the constraints on the non-negative rates $R_{\alpha,j}$ in (98), (100) and (101), is a subset of $\mathcal{R}(Y)$. We now state the following theorem.

Theorem 6.2 We have 

conv{n{Y)) C n{D), (106) 

where the convex hull operator is taken over the set of generalized symmetric auxiliary random variables 
{{Ya^k, « E Ir-1, k G Ir}, Yr} in the alphabets yf x x ... x y^_i x yR, which are jointly distributed 
with X, such that there exist deterministic functions : x x ... x A", a G Ir-i such that 

m{X,ga{{Yi^k,ieIa,keIa}))<Da, a = 1,2, K - I, (107) 

and gR : yf" x 3^|^ x ... x y§^i x yj^ X, such that 

m{X, gR{{Yi^k. iela,ke la}, Yr)) < Dr. (108) 
The proof of Theorem 6.2 relies on a result in [16], which is quoted below. Let "$\cdot$" denote the usual inner product in the Euclidean space. Let $\mathcal{R}^*(Y)$ be the set of all $R \ge 0$ such that for all $A \in \mathbb{R}_+^K$ but $A \ne 0$

$$A \cdot R \ge \sum_{\alpha=1}^{K} f_\alpha(A) H_\alpha(Y), \qquad (109)$$

where $f_\alpha(A)$ is defined in (16) of Section 2-C.

Theorem 6.3 ([16] Theorem 2)

$$\mathcal{R}(Y) = \mathcal{R}^*(Y). \qquad (110)$$

Remark: In the definition of $\mathcal{R}^*(Y)$, the requirement that $R \ge 0$ can be safely removed without loss of generality when $H_\alpha(Y) \ge 0$. To see this, let $A = (1, 0, \ldots, 0)$; then (109) reduces to $R_1 \ge H_1(Y)$ by applying Lemma 2.4.

In order to prove Theorem 6.2, consider the following. For a fixed set of (generalized symmetric) random variables $\{\{Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K\}, Y_K\}$, since both $\mathcal{R}(D)$ and $\mathcal{R}(Y)$ are convex, they can be characterized by their bounding planes. As such, if we can prove that for any $A \in \mathbb{R}_+^K$ and $A \ne 0$ the following inequality holds

$$\min_{R \in \mathcal{R}(D)} A \cdot R \le \min_{R \in \mathcal{R}(Y)} A \cdot R, \qquad (111)$$

then it follows that the region $\mathcal{R}(Y)$ is an achievable region. By Theorem 6.3, we have

$$\min_{R \in \mathcal{R}(Y)} A \cdot R = \sum_{\alpha=1}^{K} f_\alpha(A) H_\alpha(Y). \qquad (112)$$

Thus it suffices to prove that there always exists a rate vector in the achievable rate region that satisfies (112) with equality, i.e., there exists $R \in \mathcal{R}(D)$ such that

$$A \cdot R = \sum_{\alpha=1}^{K} f_\alpha(A) H_\alpha(Y), \qquad (113)$$

for any $A \in \mathbb{R}_+^K$ and $A \ne 0$. This would imply (111), which further implies the claimed result. We prove (113) and subsequently Theorem 6.2 in Appendix 12.

Notice that the region R{D) is just R(Y) with {{Y^^k, « e Ik~i, k E //<}, Yk} defined by (|73]) and (|74| ), 
the variances of which are given by (80). Since for this specific choice of random variables the values of $H_\alpha(Y) = H_\alpha(D^*)$, $\alpha = 1, 2, \ldots, K$, are given in the proof of Corollary 5.2 in Appendix 9, the following corollary is now straightforward.

Corollary 6.1 Let $D^*$ be the enhanced distortion vector of $D$, then for the Gaussian source

n{D*) cn{D*) cn{D). (ii4) 
6.3 Outer bounding the rate-distortion region

In this subsection, we provide a lower bound to the bounding plane of the Gaussian MD rate-distortion region. 



Theorem 6.4 For the Gaussian source and any A > 0, 

K , K 



(115) 



where the function $f_\alpha(A)$ is defined in (16), $d_1 \ge d_2 \ge \ldots \ge d_{K-1} \ge 0$ are arbitrary non-negative values, $d_0 = \infty$ and $d_K = 0$.



Proof 3 Recall the result in Lemma and consider the following inequalities, 



K 



K 



n 



i=l 



i=l 



Let $c_1, c_2, \ldots, c_K$ be a set of α-resolutions as defined in Theorem 2.1. Then we can write



K 



K-l 



0=1 



^ Co^{v)H{S,,ieGv)- Yl c^+i{v)H{S,,ieGv) 

Gv:\V\=a Gi,:\V\=a+l 



>A^,J{S,,telK;X' 



K-l 

a=l 
K-l 

-E 

a=l 



Ca{v)H{Si,i eGv) - Yl Ca+i{v)H{Si,i eGv] 

Gi,-\V\=a Gx,:\V\=a+l 



Y cMHiS^ieGvlY^)- Y c^+iiv)H{S„ieGv\Y: 

Gv.\V\=a Gv-\V\=a+l 



A^J{S,,ielK\X'') 



K-l 



Y c,(«)/(5„^gG«;F,")- Y c^+iW{S,,ieGv;Y:) 

G^:|1;|=q: G^:|t;|=Q+l 
K-l 

+ A^^ [J(5„ I G Ik; X") - I G Ik; , 



K 



1=1 



(116) 



(117) 



where (a) is by adding and subtracting the same terms, and due to the fact that $S_i$, $i \in I_K$, are deterministic functions of $X^n$, and (b) is by a conditional version of the covering property of the given sequence of the
optimal a-resolutions as defined in f T/gl) . At this point, the expression is quite similar to ^3^ . and we can 
apply Lemma 3.1 to complete the proof.

Parallel to the sum rate case, we can specialize the lower bound by choosing the values of $d_1, d_2, \ldots, d_{K-1}$.



Corollary 6.2 For the Gaussian source, we have 

K K K K 

J2 ^^^^ > 2 S ^"^^^ ~ 2 5^ ^"-^^^^ " ^""^^ + 2 5^ ^"^^^ ~ ^^^^^ 

i=l a=l a=2 a=2 

. K K K 

> Y,UA)\og^--Y, /a-i(A) log a + - 5^ UA) log(a - 1), (119) 



a=l " a=2 a=2 



where $D^*$ is the enhanced distortion vector of $D$.



The proof is given in Appendix 13; it follows a similar line as the proof of Corollary 5.3, with the additional application of Lemma 2.3 in one step.

Next we proceed to establish that the outer bound given above is indeed a polytope. More precisely, define $\mathcal{R}_l(D^*)$ to be the set of $R \in \mathbb{R}^K$ such that (118) holds for any $A \in \mathbb{R}_+^K$ and $A \ne 0$. Note that we do not require $R_i \ge 0$ in this set. The following corollary establishes a polytopic outer bound.

Corollary 6.3 Let D* be the enhanced distortion vector of D, then 1Zl{D*) Pi M;^ is a polytope such that 

The proof of this corollary is given in Appendix 14. The key idea is the following: though we have an uncountable number of bounding planes to characterize $\mathcal{R}_l(D^*)$, if there exists a set $S_b \subset \mathcal{R}_l(D^*)$ with a finite number of elements, such that for each $A$, inequality (118) can be satisfied with equality for some element in $S_b$, then $\mathcal{R}_l(D^*)$ is a polytope. The proof given in Appendix 14 establishes the existence of such a finite set.



6.4 Bounding the gap between outer and inner bounds 

Now we are ready to prove Theorem 4.2 and Theorem 4.3, which are presented below.

Proof 4 (Proof of Theorem 4.2 and Theorem 4.3) Theorem 4.2 is implied by Theorem 6.1, Theorem 6.2 (or rather Corollary 6.1), the fact that $D^*$ enhances $D$, and the fact that for $\alpha = 2, 3, \ldots, K$

H.iD*) ^ i log < 1 log (120) 

andHi{D*) = | log 

The first inequality in (61) can be proved by (119) and the definition of $H(D^*)$, and invoking Theorem 4.2, Theorem 6.3, Theorem 6.4 and Corollary 6.2. To prove the first inequality in (62) of Theorem 4.3, we again combine Theorem 4.2, Theorem 6.3, Theorem 6.4 and Corollary 6.2, and notice the following fact



K 



a=l 

K-1 



. K K K 

a=l " Q=2 a=2 

The first inequality in f l62]) now follows from dTTSj) anJ definition ofH{D*). 
To prove the second inequality in (61), we write

1 ^ 1 ^ 

2 log«- 2 log(a- 1) 

K-l 



= ^ log2 + i ^ UA)[log{a + 1) - log(a - 1)] - iog(ir - 1) 

a=2 

< ^ iog2 + 1 x; + 1) - - 1)] - ^ iog(^ - 1) 

^ 1 A ^ } A ■ 

< Z^^^^ylog" ^Z^-log(«-l) ^log(^-l), (122) 

where in (a) we use Lemma 2.4. The second inequality in (62) can be proved similarly, and the details are omitted.



6.5 Extension to general sources 

Similar to the SID-RD approximation, we can extend the rate-distortion region approximation technique to general sources under the MSE distortion measure. It is clear that the definition of $\mathcal{R}'(Y)$ is not limited to the Gaussian source; we denote by $\mathcal{R}'(D)$ the region $\mathcal{R}'(Y)$ with the random variables $Y$ defined as in (66) and (67). Define the following function,

$$R'_A(D) = \min_{R \in \mathcal{R}'(D)} A \cdot R. \qquad (123)$$

We have the following theorem.

Theorem 6.5 For any general source $X$ with unit variance under the MSE distortion measure, we have

$$\mathcal{R}'(D) \subseteq \mathcal{R}(D), \qquad (124)$$

moreover, for any $A \in \mathbb{R}_+^K$ and $A \ne 0$

k^{D)-R^{D)<j2^-^- (125) 

a=l 

The proof follows closely the sum rate case proof for general sources, and we thus omit it here. 
7 Conclusion



We provide approximate characterizations of the individual-description R-D function, as well as the achievable rate region, for the Gaussian MD problem under symmetric distortion constraints. This is done by combining two inter-connected parts: the derivation of a novel outer bound, and a careful analysis of achievability schemes to generate inner bounds that can be easily compared with the outer bound. Neither the outer bound alone nor the inner bound alone is able to provide this result, and particular care has to be taken in order to make them compatible. A result in a similar vein was recently obtained by Etkin et al. [26] for the Gaussian interference channel.

The new lower bound is obtained by generalizing Ozarow's well-known technique, expanding the probability space of the original problem by more than one random variable with special structure among them. This technique appears to be promising, and we expect to see its application in other difficult multi-terminal information-theoretic communication problems.

The multi-level diversity coding problem, which can be understood as a lossless counterpart of the MD problem, sheds tremendous light on the geometric structure of the MD rate-distortion region. We use the lossless MLD coding rate region as a polytopic template for both inner and outer bounds for the MD rate-distortion region. With the increasing complexity of the source coding problems being considered in the information theory literature, we expect the complexity of their lossless counterparts to increase as well, and the difficulty of the corresponding lossless problem to become an increasingly dominant component of the overall problem. In this context, our work can be understood as a first attempt to make an explicit connection between the lossless source coding problem and its lossy counterpart. It is worth noting that in multi-terminal channel coding problems, several well-known recent works can be understood as using deterministic models, for example, the network coding results in [27], and the deterministic wireless relay channel model in [28]. There exists a philosophical connection between the approach taken in this work and the "one-bit" approximation result for the Gaussian interference channel in [26], as well as the approximate capacity result for the Gaussian relay network [29]. In [29], an approximate characterization was motivated by the insight obtained in studying deterministic relay networks [28], which play a role analogous to that of the lossless multi-level diversity coding problem in our work. In both cases, the connection provides useful insight into the coding scheme and the outer bounding proof technique. We expect that in the near future connections between lossless (deterministic) models and their lossy (non-deterministic) counterparts will be made for other information-theoretic problems, and that the approach of using the former as a guideline in treating the latter will be a fruitful path.






8 Proof of Lemma 3.1



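Proof 5 Define $Z_a = N_a + N_b$ and $Z_b = N_b$. To prove the first statement, we consider the following chain of inequalities

$$
\begin{aligned}
I(S_i, i \in G_v; Y_a^n) &= n h(Y_a) - h(Y_a^n | S_i, i \in G_v) \\
&= n h(Y_a) - h(X^n + Z_a^n | S_i, i \in G_v) \\
&= n h(Y_a) - h(X^n + Z_a^n - \hat{X}_v^n | S_i, i \in G_v) \\
&\overset{(a)}{\ge} n h(Y_a) - h(X^n + Z_a^n - \hat{X}_v^n) \\
&\overset{(b)}{\ge} n h(Y_a) - \sum_{i=1}^{n} h(X(i) + Z_a(i) - \hat{X}_v(i)) \\
&\overset{(c)}{\ge} n h(Y_a) - \sum_{i=1}^{n} \frac{1}{2} \log \left\{ (2\pi e) E\left[ (X(i) + Z_a(i) - \hat{X}_v(i))^2 \right] \right\} \\
&= n h(Y_a) - \sum_{i=1}^{n} \frac{1}{2} \log \left[ (2\pi e) \left( E d(X(i), \hat{X}_v(i)) + d_a \right) \right],
\end{aligned}
$$

where $\hat{X}_v^n$ is the reconstruction with descriptions $S_i$, $i \in G_v$, and its $i$-th position is denoted as $\hat{X}_v(i)$. The inequality (a) is because conditioning reduces entropy, (b) is because of the chain rule for differential entropy and the fact that conditioning reduces entropy, and (c) is because the Gaussian distribution maximizes the differential entropy for a given second moment. Since $\log(\cdot)$ is a concave function, we have

$$\sum_{i=1}^{n} \frac{1}{2} \log \left[ (2\pi e) \left( E d(X(i), \hat{X}_v(i)) + d_a \right) \right] \le \frac{n}{2} \log \left[ (2\pi e) \left( E d(X^n, \hat{X}_v^n) + d_a \right) \right].$$

And it follows

$$
\begin{aligned}
I(S_i, i \in G_v; Y_a^n) &\ge n h(Y_a) - \frac{n}{2} \log \left[ (2\pi e) \left( E d(X^n, \hat{X}_v^n) + d_a \right) \right] \\
&\ge n h(Y_a) - \frac{n}{2} \log \left[ (2\pi e) (D_{|v|} + d_a) \right] \\
&= \frac{n}{2} \log \frac{1 + d_a}{D_{|v|} + d_a},
\end{aligned}
$$

where the last equality uses $h(Y_a) = \frac{1}{2} \log[(2\pi e)(1 + d_a)]$ for the unit-variance Gaussian source; this is the first claim.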
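
The concavity step in this argument can be checked numerically. The sketch below assumes per-letter expected distortions are given as plain numbers, that the block distortion is their average, and that the noise variance is $d_a$; the function names are hypothetical and logarithms are taken to base 2.

```python
import math


def per_letter_sum(distortions, d_a):
    """Sum over letters of (1/2) log2[2*pi*e*(D_i + d_a)]."""
    return sum(0.5 * math.log2(2 * math.pi * math.e * (D + d_a))
               for D in distortions)


def jensen_upper(distortions, d_a):
    """(n/2) log2[2*pi*e*(average distortion + d_a)], the Jensen upper bound."""
    n = len(distortions)
    avg = sum(distortions) / n
    return 0.5 * n * math.log2(2 * math.pi * math.e * (avg + d_a))


def first_claim_bound(n, D, d_a):
    """(n/2) log2[(1 + d_a)/(D + d_a)], for a unit-variance Gaussian source."""
    return 0.5 * n * math.log2((1 + d_a) / (D + d_a))


if __name__ == "__main__":
    sample = [0.1, 0.5, 0.9, 0.2]  # illustrative per-letter distortions
    print(per_letter_sum(sample, 0.3), jensen_upper(sample, 0.3))
    print(first_claim_bound(len(sample), 0.5, 0.3))
```

The two quantities coincide when all per-letter distortions are equal, so the step loses nothing for a scheme with uniform per-letter distortion.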

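To prove the second claim, we write the following

$$
\begin{aligned}
& I(S_i, i \in G_v; Y_b^n) - I(S_i, i \in G_v; Y_a^n) \\
&\quad = n h(Y_b) - n h(Y_a) + h(Y_a^n | S_i, i \in G_v) - h(Y_b^n | S_i, i \in G_v).
\end{aligned}
$$

For the latter two terms, we have

$$
\begin{aligned}
h(Y_a^n | S_i, i \in G_v) - h(Y_b^n | S_i, i \in G_v)
&\overset{(a)}{=} h(Y_a^n | S_i, i \in G_v) - h(Y_b^n | N_a^n, \{S_i, i \in G_v\}) \\
&\overset{(b)}{=} h(Y_a^n | S_i, i \in G_v) - h(Y_a^n | N_a^n, \{S_i, i \in G_v\}) \\
&= I(Y_a^n; N_a^n | S_i, i \in G_v),
\end{aligned}
$$

where (a) is because $N_a$ is independent of $Y_b$ and $\{S_i, i \in G_v\}$; (b) is by the definition of $Y_a$. Continuing along this line, we have

$$
\begin{aligned}
I(Y_a^n; N_a^n | S_i, i \in G_v)
&\overset{(a)}{=} h(N_a^n) - h(N_a^n | X^n + N_a^n + N_b^n, \{S_i, i \in G_v\}) \\
&= h(N_a^n) - h(N_a^n | X^n - \hat{X}_v^n + N_a^n + N_b^n, \hat{X}_v^n, \{S_i, i \in G_v\}) \\
&\overset{(b)}{\ge} h(N_a^n) - h(N_a^n | X^n - \hat{X}_v^n + N_a^n + N_b^n) \\
&\overset{(c)}{\ge} \sum_{i=1}^{n} \left[ h(N_a(i)) - h(N_a(i) | X(i) - \hat{X}_v(i) + N_b(i) + N_a(i)) \right] \\
&= \sum_{i=1}^{n} I(N_a(i); X(i) - \hat{X}_v(i) + N_b(i) + N_a(i)) \\
&\overset{(d)}{\ge} \sum_{i=1}^{n} \frac{1}{2} \log \frac{E d(X(i), \hat{X}_v(i)) + d_a}{E d(X(i), \hat{X}_v(i)) + d_b} \\
&\overset{(e)}{\ge} \frac{n}{2} \log \frac{D_{|v|} + d_a}{D_{|v|} + d_b},
\end{aligned}
$$

where (a) is because $N_a$ is independent of $S_i$, $i \in G_v$; (b) is because conditioning reduces entropy; (c) is by applying the chain rule, and the facts that $N_a^n$ is an i.i.d. sequence and conditioning reduces entropy; (d) is by applying the mutual information game result (see page 263, [22], as well as [24]) that Gaussian noise is the worst additive noise under a variance constraint, and taking $N_a(i)$ as channel input; finally (e) is due to the convexity and monotonicity of $\log \frac{x + d_a}{x + d_b}$ in $x \in (0, \infty)$ when $d_a \ge d_b \ge 0$. This completes the proof for the second claim.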



We note that a similar line of argument was used in [8] to derive a sum rate lower bound for a system with two levels of distortion constraints. However, Lemma 3.1 generalizes that result since there exists only one auxiliary random variable in the setting of [8], but there are two auxiliary random variables $Y_a$ and $Y_b$ in the current setting.



9 Proof of Corollary 5.2

Proof 6 We first rewrite the rate formula given in Theorem 5.2. For a fixed set of (generalized symmetric) auxiliary random variables $\{\{Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K\}, Y_K\}$, recall the definition of the following quantities for $\alpha \in I_{K-1}$

$$H_\alpha(Y) = h(Y_{\alpha,i}, i \in I_\alpha | Y_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha) - \frac{\alpha}{K} h(Y_{\alpha,i}, i \in I_K | X, \{Y_{j,k}, j \in I_{\alpha-1}, k \in I_K\}), \qquad (126)$$

and

$$H_K(Y) = I(X; Y_K | Y_{\alpha,k}, \alpha \in I_{K-1}, k \in I_K). \qquad (127)$$

Then it follows that 

K , K-1 



a — ' a 

a=l a=l 

+ ^h{YK\Yj,k,J e lK-i,ke Ik) - ^K{Yj^k,J G lK-i,ke Ik},Yk\X), (128) 

where the right hand side is the rate expression given in Theorem \5.2\ 

Now for the specific set of random variables defined by dTH) and dSO]), we have for a = 2,3, K — 1 

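$$
\begin{aligned}
H_\alpha(Y) &= h(Y_{\alpha,i}, i \in I_\alpha | Y_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha) \\
&\quad - \frac{\alpha}{K} h(X + Z_{\alpha,i}, i \in I_K | X, \{X + Z_{j,k}, j \in I_{\alpha-1}, k \in I_K\}) \\
&\overset{(a)}{=} h(Y_{\alpha,i}, i \in I_\alpha | Y_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha) - \frac{\alpha}{K} h(Z_{\alpha,i}, i \in I_K | Z_{j,k}, j \in I_{\alpha-1}, k \in I_K) \\
&\overset{(b)}{=} h(Y_{\alpha,i}, i \in I_\alpha | Y_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha) - h(Z_{\alpha,i}, i \in I_\alpha | Z_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha) \\
&\overset{(c)}{=} h(Y_{\alpha,i}, i \in I_\alpha | Y_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha) - h(Y_{\alpha,i}, i \in I_\alpha | X, \{Y_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha\}) \\
&= I(Y_{\alpha,i}, i \in I_\alpha; X | Y_{j,k}, j \in I_{\alpha-1}, k \in I_\alpha), \qquad (129)
\end{aligned}
$$

where (a) and (c) are because $X$ is independent of $Z_{\alpha,i}$; (b) is because of the chain rule and the fact that $Z_{\alpha,i}$ is independent of $\{Z_{\alpha,k}, k \in I_K, k \ne i\}$. Because of the Markov string $\{Y_{1,k}, k \in I_K\} \leftrightarrow \{Y_{2,k}, k \in I_K\} \leftrightarrow \ldots \leftrightarrow \{Y_{K-1,k}, k \in I_K\} \leftrightarrow X$, we have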

Ha{D*) = h{X\Ya-l„te la) " h{X\Ya„t G la) = \^^^ T^^W^kf^ ^ (130) 

by the choices of the variances of the Gaussian random variables Na,k- For a = 1 and a = K, it is 
straightforward to verify that 

i?l(0-) = ilogi,, 



Combining A130\) and A131\) we have, 

) = -log- + -5:-log^^-^^ 

a=l ^ «=2 ^ "^J^^ c 

= 2 ^ " 9 > (132) 

which completes the proof by defining $D_0 = 1$.



2^ ^ D* 2^ a ^ a-1 

a=l " a=2 
10 Proof of Corollary 5.3



Proof 7 To facilitate discussion, define the following index set of loose constraints

$$C_L = \{\alpha : D^*_\alpha < D_\alpha\}, \qquad (133)$$

and it follows that $C_L^c = I_K \setminus C_L$; note that $1 \in C_L^c$. For a given $\alpha \in C_L$, define $N(\alpha)$ as the index of the lower neighboring distortion constraint to $\alpha$ that is not loose, i.e., $N(\alpha) = \max_{k < \alpha, k \in C_L^c} k$.

We first consider the case when the distortion vector is given such that it satisfies the conditions

$$\phi_{\alpha-1}(D_{\alpha-1}) \ge \phi_\alpha(D_\alpha), \quad \alpha = 2, 3, \ldots, K, \qquad (134)$$

where we take $D_0 = 1$. Note this implies $D_\alpha = D^*_\alpha$, $\alpha = 1, 2, \ldots, K$, and $C_L = \emptyset$. In this case we choose $d_\alpha = \phi_\alpha(D_\alpha)$, for $\alpha = 1, 2, \ldots, K-1$, which is clearly valid. We start from Theorem 5.4 to show that for the specific choice of $d_\alpha$, the claim holds.



K 



> 



i=l 



K 
~2 



f 1, (1 



1 + da){Da + da-l] 



2 ^ a ^ 

a=2 



+ da~l){Da + da) 

Da-l 1 + (a -1)1^0 



a 



Da 



Do 



Dr 



(a-2)Da-i 



K ^ 1 1 



D 



K-l 



a + 1 — Da 
K -1-Dk + 



Dk- 



K^l 



E-L , Da-l . 
- log h 
a 



K^^l 



2 ^ a Da 2 ^ a 

a=l " a=2 



Dk 1 + iK- 2)Dk-i 
1 + (a - l)Da a-l-Da + 



Da 



1 + (a - 2)Da-i a + 1 - Do 



K^l, Da-l ^ K 
— > — log lo 



K-1-Dk + 



Dk- 



1 + {K- 2)Dk-i 



K-l 



2 ^ a 



k'^i 



(«) K 
> 



+-y-\o 

a- 

9 ^ n, 



2 ' " {2-Di 
a — 1 — Da 



-Et- 



1 



a + 1 



log(l + («-l)D, 



Da 
D0.-1 



a 



Da 



2 

a=l 



Da-l , 

log 

Da 2 ^ 



K-l 



{2-D 



1 ^ V- i , 

:7^ + tE-i°' 



2 a 

a=2 



^ " \ +\\og{K-l) 
a + i — L>a 2 



K 
~2 



> 



0=1 a=2 a=2 

K^l, Da-l 1 _^K^1 



2 ^ a 

a=l 



(135) 



a=2 



where in (a) we used J^" ^ > Da, and omitted the third term which is positive. Thus the claim is true if i\134\) 
holds. 
For the case when (134) does not hold, we choose $d_\alpha = \phi_\alpha(D_\alpha)$ for $\alpha \in C_L^c$ as before; however for any $\alpha \in C_L$, we choose $d_\alpha = d_{N(\alpha)}$. Note that with such a choice, we have $d_\alpha = d_{\alpha-1}$ and therefore,

(1 + da-l){Da + da) 

If we replace $D_\alpha$ with $D^*_\alpha$ in the left hand side of the above equation, the equality still holds; moreover

$$d_\alpha = \phi_\alpha(D^*_\alpha), \quad \alpha \in C_L. \qquad (137)$$

Thus by using this particular choice of $(d_1, d_2, \ldots, d_{K-1})$, we have

^ ' " 2 a {1 + da-l){Da + da) 

. i,o,(l±M£a±|^, ,138) 

2 " {^ + da-l){D*^ + da) 

and the exact same derivation holds as in the case when (134) holds, with $D^*_\alpha$ replacing $D_\alpha$. Dividing both sides of (135) by $K$ completes the proof of the corollary.



11 Proof of Theorem 5.5



Proof 8 We pick up the story from for the lower bound and rewrite it slightly differently. 

K K~l 

n5^(i?. + e)>5^— ^ [HS,,zeCv;Y:)-I{S,,zeCv;Y:_,)] 

+ [I{S,, t e Ik; X") - I{S^, I G Ik; V-x-i)] • (139) 



where now the random variables $Y_\alpha$, $\alpha = 1, 2, \ldots, K-1$, are defined as in (66) and (67), and for simplicity we define $Y_0 = 0$, i.e., a constant.

Next we consider the upper bound $R'(D)$, c.f. Theorem 6.1, using the same set of random variables $Y_\alpha$, $\alpha = 1, 2, \ldots, K$, as above

$$R'(D) = \sum_{\alpha=1}^{K} \frac{1}{\alpha} I(X; Y_\alpha | Y_{\alpha-1}) = \sum_{\alpha=1}^{K} \frac{1}{\alpha} \left[ I(X; Y_\alpha) - I(X; Y_{\alpha-1}) \right]. \qquad (140)$$

Note we have used the fact that $X \leftrightarrow Y_\alpha \leftrightarrow Y_{\alpha-1}$ is a Markov string for any $\alpha \in I_K$. The auxiliary random variables used in the lower and upper bounds are in fact the same, and it is clear that this is a valid choice in deriving the lower bound by definition.
Thus we can now bound the difference between the upper and lower bounds on the symmetric individual-description rate as follows

^ 1 1 

^ -i{X; - — [i{S,,, I e Ik; X") - i{S,, I e Ik\ 



nK 



K-l 
a=l 

^ I ^ 

a=l I "Vay Gv.\V\=a 



1 



E l-f(Si.»eG„;y-)-/(S.,'eG^;iT-i)l 



(^) 



G-i;:|V|=a 



n n 



K 



1 



1 



+ - /(X; Yk) - /(X; Yk-i) - -1(3,, i G Ik. X") + -J(5„ 2 G J^; 



(141) 



Now consider an arbitrary a G Ik~i, cind an arbitrary v such that \ v\ = a, it follows that 

/(X; F„) - /(X; - ^I{S„ z e Gv; O + ^I{S„ leGv; C-i) 
= /i(F„)-/i(r,|X)-/i(F,_i) + /i(F,_i|X) 

- h{Y^:\S„i G G^)] + - h{Y:_,\S,,i G G^)] 

n n 

-h{Z^) + + -hiY:\S,,i G G^) - -hiY:_,\S,,ie Gv) 

n n 



{c) 1 

< 2l°g 



2-D„ 



< - 

- 2 



(142) 



where (a) is because $Y_\alpha^n$ and $Y_{\alpha-1}^n$ are independent sequences, (b) holds since $Y_{\alpha-1} = Y_\alpha + N_{\alpha-1}$, and in (c) we used the bounding technique used in the proof of Lemma 3.1 and continued to use the definition of

K 



a = 1,2,...,K -1, 



(143) 



and finally in (c) we used the fact Da — 



Da 



Da 



< and Da > 0. 



The last term in AMU can be bounded similarly by noticing I{Si, i G Ik] X") > /(Sj, i G Ik] Y^) 

/(X; Yk) - /(X; Yk^i) - -I{Si, i G Ik] X") + i G Ik] Y^_,) 

n n 

< /(X; Yk) - /(X; Yk^i) - i G F;?) + z G r^.J 

= -/i(Z^) + KZk-i) + G Ik) - -h{Y^_^\S,,i G /,,) 

n n 



DK + dK-lj 



1 

< - 
- 2 



(144) 
where the last step is by aj^ —

Now summarizing all the bounds derived above, we have that

$$R'(D) - R(D) \le \sum_{\alpha=1}^{K} \frac{1}{2\alpha}, \qquad (145)$$

which completes the proof.



12 Proof of Theorem 6.2 



Proof 9 (Proof of Theorem 16.21) Fix a set of (generalized symmetric) randomvariables {{Ya^k, o G lK^i,k G 

Ik}, Yk}- For a given A > 0, let la be the non-negative integer defined in Lemma \2^ for the a-level. For any 
a G Ik, let 

r if l<k<la\ 



OL- 



Ra,k will be the rate assigned to the a-th layer for the k-th description; denote {Ra,i, Ra,2, Ra,K) R. 
It is clear from the original PPR multilayer scheme [6] that if each of the description has rate approximately 
Ha/ a at the a-th level, then any of the a descriptions can guarantees decoding with high probability. How- 
ever, because the first la descriptions are not given any rate for the a-th layer in ( \146\i . this can not be achieved 
directly without proper coding. 

The generalized coding scheme combines the original PPR multilayer scheme with proper MDS channel codes. The PPR multilayer scheme is still used as the main encoding step; let us denote the codeword (the output index written in a large enough appropriate alphabet) for the $\alpha$-th level for description $k$ as $C_{\alpha,k}$, for $\alpha \in I_K$. A post-coding packaging step is now added at the $\alpha$-th layer as follows. The last $K - l_\alpha$ codeword indices are written in the descriptions as in the original scheme. Each of the first $l_\alpha$ codeword indices $C_{\alpha,k}$, $k = 1, 2, \ldots, l_\alpha$, is encoded by a $(K - l_\alpha, \alpha - l_\alpha)$ MDS code, and each of the resulting codewords (indices) is written into one of the last $K - l_\alpha$ descriptions. This results in an additional rate $l_\alpha H_\alpha / (\alpha(\alpha - l_\alpha))$ in each description. Note that since $l_\alpha \le \alpha - 1$, the above MDS code rate is always well defined. It is clear that the rate of the $k$-th description, $k > l_\alpha$, for the $\alpha$-th layer is

$$
R_{\alpha,k} = \frac{H_\alpha}{\alpha} + \frac{l_\alpha H_\alpha}{\alpha(\alpha - l_\alpha)} = \frac{H_\alpha}{\alpha - l_\alpha}, \tag{147}
$$

as we claimed.
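The rate bookkeeping of the packaging step can be checked with exact arithmetic. The following is only an illustrative sketch: the function name `layer_rates` and the numerical values are assumptions made for this example and do not come from the paper. It assumes each PPR codeword index has rate $H_\alpha/\alpha$ and that one symbol of a $(K - l_\alpha, \alpha - l_\alpha)$ MDS code carries a $1/(\alpha - l_\alpha)$ fraction of that rate.

```python
from fractions import Fraction

def layer_rates(H, K, alpha, l):
    """Per-description rates of one layer after the packaging step (hypothetical helper).

    H     : layer entropy H_alpha (any rational value)
    K     : number of descriptions
    alpha : layer index, 1 <= alpha <= K
    l     : number of leading descriptions given zero rate, 0 <= l <= alpha - 1
    """
    if not (1 <= alpha <= K and 0 <= l <= alpha - 1):
        raise ValueError("need 1 <= alpha <= K and 0 <= l <= alpha - 1")
    H = Fraction(H)
    base = H / alpha            # rate of one PPR codeword index
    piece = base / (alpha - l)  # rate of one MDS-coded piece of such a codeword
    extra = l * piece           # one piece from each of the first l codewords
    # The first l descriptions carry nothing for this layer; the rest carry base + extra.
    return [Fraction(0)] * l + [base + extra] * (K - l)

# Example values chosen only for illustration.
rates = layer_rates(H=6, K=5, alpha=3, l=1)
```

In this example the last four descriptions each receive $6/3 + 1 \cdot 6/(3 \cdot 2) = 3 = 6/(3-1)$, which agrees with the closed form $H_\alpha/(\alpha - l_\alpha)$.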

At the decoder, suppose $k$ descriptions in the set $G_v$ are available, where $|v| = k$. Consider a specific level $\alpha \in I_k$; the pre-decoding unpackaging procedure is as follows. Suppose $u_\alpha$ of the indices in $G_v$ are smaller than or equal to $l_\alpha$, i.e., $u_\alpha = |G_v \cap I_{l_\alpha}|$. From the remaining $k - u_\alpha$ descriptions, clearly we can recover their respective codewords, i.e., $C_{\alpha,i}$ for $i \in G_v \setminus I_{l_\alpha}$. Moreover, since $u_\alpha \le l_\alpha$, we also have $k - u_\alpha \ge \alpha - u_\alpha \ge \alpha - l_\alpha$ pieces of the MDS encoded $C_{\alpha,i}$ for $i \in I_{l_\alpha}$, which can be correctly decoded by the property of the MDS code. Since $G_v \cap I_{l_\alpha} \subseteq I_{l_\alpha}$, we can recover all $C_{\alpha,i}$ for $i \in G_v$. This holds true for all $\alpha = 1, 2, \ldots, k$, and then the main decoding step in the PPR multilayer scheme can be applied.
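The counting argument behind the unpackaging step can be verified exhaustively for small $K$. This is a minimal sketch under stated assumptions: the names `unpack_layer` and `check_unpacking` are hypothetical, descriptions are indexed from 1, and the MDS code is modeled only through its defining property that any $\alpha - l_\alpha$ coded pieces recover the message.

```python
from itertools import combinations

def unpack_layer(available, alpha, l):
    """Indices whose layer-alpha codeword is recovered (hypothetical helper).

    available : iterable of received description indices (1-based)
    alpha     : layer index
    l         : number of leading codewords that were MDS-packaged
    """
    first = set(range(1, l + 1))        # codewords stored only as MDS pieces
    direct = set(available) - first     # descriptions carrying their own codeword
    recovered = set(direct)
    # Each description outside `first` also carries one piece of every packaged
    # codeword; any alpha - l pieces suffice by the MDS property.
    if len(direct) >= alpha - l:
        recovered |= first
    return recovered

def check_unpacking(K):
    """Check that any k >= alpha descriptions recover all of their own codewords."""
    for alpha in range(1, K + 1):
        for l in range(0, alpha):
            for k in range(alpha, K + 1):
                for available in combinations(range(1, K + 1), k):
                    if not set(available) <= unpack_layer(available, alpha, l):
                        return False
    return True

ok = check_unpacking(6)
```

The check succeeds because a received set of size $k \ge \alpha$ contains at most $l$ indices from the first $l$ descriptions, so at least $\alpha - l$ pieces of each packaged codeword are always present.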

We remark here that the decoding can be easily improved, because if $u_\alpha < l_\alpha$, there is additional information that the main decoding step is not utilizing. However, the above simple procedure suffices for proving the current theorem.
It remains to show rtiiil) is true with the given rate vector, the proof of which follows closely the steps in [16] for the proof of Theorem 6.3. Let $\{c_\alpha(v)\}$ be an optimal $\alpha$-resolution for $A$. We have

$$
A \cdot R_\alpha = \sum_{v} c_\alpha(v)\,(v \cdot R_\alpha) + \Big(A - \sum_{v} c_\alpha(v)\, v\Big) \cdot R_\alpha.
$$

By Lemma IZTI, for any $v$ where $|v| = \alpha$ such that $c_\alpha(v) > 0$, $v_i = 1$ for $i = 1, 2, \ldots, l_\alpha$; moreover, exactly $\alpha - l_\alpha$ of the remaining components are 1's. Since the first $l_\alpha$ components of $R_\alpha$ are 0's, and the remaining components are equal, we have

$$
v \cdot R_\alpha = (\alpha - l_\alpha)\,\frac{H_\alpha}{\alpha - l_\alpha} = H_\alpha \quad \text{for } v : c_\alpha(v) > 0.
$$

It follows that

$$
\sum_{v} c_\alpha(v)\,(v \cdot R_\alpha) = H_\alpha \sum_{v} c_\alpha(v) = f_\alpha(A)\, H_\alpha. \tag{148}
$$



Since

$$
A - \sum_{v} c_\alpha(v)\, v = A - \tilde{A} \tag{149}
$$

has zeros in the last $K - l_\alpha$ components, and $R_\alpha$ has zeros in the complement positions, we have

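$$
\Big(A - \sum_{v} c_\alpha(v)\, v\Big) \cdot R_\alpha = 0. \tag{150}
$$

It follows that

$$
A \cdot R_\alpha = f_\alpha(A)\, H_\alpha. \tag{151}
$$

Summing over $\alpha \in I_K$ now completes the proof since $R = \sum_{\alpha=1}^{K} R_\alpha$.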
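The inner-product step of this proof can be illustrated numerically. The sketch below uses assumed example values ($K = 5$, $\alpha = 3$, $l_\alpha = 1$, $H_\alpha = 6$) and a hypothetical helper `supporting_patterns`. It enumerates every 0/1 vector $v$ with ones in the first $l_\alpha$ positions and exactly $\alpha - l_\alpha$ ones elsewhere, and evaluates $v \cdot R_\alpha$ for the rate vector with zeros in the first $l_\alpha$ components and equal entries $H_\alpha/(\alpha - l_\alpha)$ in the rest.

```python
from fractions import Fraction
from itertools import combinations

def supporting_patterns(K, alpha, l):
    """Yield 0/1 vectors with ones in the first l positions and exactly
    alpha - l ones among the remaining K - l positions (hypothetical helper)."""
    for rest in combinations(range(l, K), alpha - l):
        v = [1] * l + [0] * (K - l)
        for i in rest:
            v[i] = 1
        yield v

def dot(u, w):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, w))

# Example values chosen only for illustration.
K, alpha, l, H = 5, 3, 1, Fraction(6)
R = [Fraction(0)] * l + [H / (alpha - l)] * (K - l)

# Every admissible pattern should give the same inner product, namely H.
values = {dot(v, R) for v in supporting_patterns(K, alpha, l)}
```

All six admissible patterns give the same value $H_\alpha$, which is what lets $H_\alpha$ be pulled out of the sum in (148).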
13 Proof of Corollary lO



Proof 10 Follow the proof approach for Corollary \5.3\ however we directly use D* to replace D. Let do 
^a{.DD for a = 1, 2, K — l,we have 



K 



K-l 



i=l 



a=2 



Dt 



Dli2-Dl) 2 



+ -fK{A) log 



D}, 1 + {K- 2)Dl^, 



K 



> -^^(A)log 



D* 1 



A'-l 



£)* 2 



- log(l + (« - 



1 ^ if 

- fo.-i{A) log(a - Dl_,) + - ^ /„(A) log(a - 1) 



a=2 



a=2 



> - ^ /.(A) log ^ - - 5^ log(a - + - 5^ /.(A) log(a - 1), (152) 



Q=l " o=2 o=2 



where (a) is true because J^°' > Da, and in (b) we omitted the second term, because Lemma implies 
/a(A) < ^^/q_i(A) < /q,„i(A). This completes the proof. 



14 Proof of Corollary E31 

Proof 11 Clearly we only need to prove that the set $\mathcal{R}_l(D^*)$ is a polytope. Since $\log(\alpha - D^*_{\alpha-1}) \ge 0$ for $\alpha \ge 2$, we can construct a set of independent fictitious sources $U_1, U_2, \ldots, U_K$, such that

$$
H(U_\alpha) = \frac{1}{2} \log(\alpha + 1 - D^*_\alpha), \quad \alpha = 1, 2, \ldots, K-1, \tag{153}
$$

and $H(U_K) = 0$. The MLD coding rate region for this $K$-source can be equivalently given in two forms, as implied by Theorem 6.3 with $H_\alpha(Y)$ replaced by $H(U_\alpha)$. Since the rate region of this MLD coding problem is clearly a polytope, there exists a finite set of rate vectors, denoted as $S_r$, such that for any $A$, there exists at least one rate vector $(r_1, r_2, \ldots, r_K) \in S_r$ such that

$$
\sum_{i=1}^{K} A_i r_i = \frac{1}{2} \sum_{\alpha=2}^{K} f_{\alpha-1}(A) \log(\alpha - D^*_{\alpha-1}). \tag{154}
$$

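Now define $\tilde{R}_i = R_i + r_i$, $i = 1, 2, \ldots, K$, and consequently ([773) reduces to the condition that

$$
\sum_{i=1}^{K} A_i \tilde{R}_i \ge \frac{1}{2} \sum_{\alpha=1}^{K} f_\alpha(A) \log \frac{1}{D^*_\alpha} + \frac{1}{2} \sum_{\alpha=2}^{K} f_\alpha(A) \log(\alpha - 1). \tag{155}
$$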

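We can again define a set of fictitious independent sources $W_1, W_2, \ldots, W_K$, such that

$$
H(W_\alpha) = \frac{1}{2} \log \frac{1}{D^*_\alpha} + \frac{1}{2} \log(\alpha - 1), \quad \alpha = 2, 3, \ldots, K, \tag{156}
$$

and

$$
H(W_1) = \frac{1}{2} \log \frac{1}{D^*_1}. \tag{157}
$$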

Now we would like to apply Theorem 6.3 to assert that (155) is in fact a characterization of the MLD coding rate region for this source; however, one technicality has to be addressed first. Recall that $R$ is not constrained to be non-negative, because otherwise $\tilde{R}$ must satisfy the additional constraint $\tilde{R} \ge r$, and Theorem 6.3 cannot be applied directly. However, by relaxing $R$ to allow negative components, $\tilde{R}$ may have non-positive components, which would render Theorem 6.3 not applicable without the fact given in the remark immediately after Theorem 6.3. With that remark, by applying Theorem 6.3 we see that (155) is indeed a characterization of the MLD coding rate region for this source.

Since the MLD coding rate region is a polytope, there exists a finite set of rate vectors $S_{\tilde{R}}$ such that for any $A$, there exists at least one rate vector $(\tilde{R}_1, \tilde{R}_2, \ldots, \tilde{R}_K) \in S_{\tilde{R}}$ such that (155) is satisfied with equality. Since both $S_r$ and $S_{\tilde{R}}$ are finite, it follows that there exists a finite set $S_R$ such that for any $A$, there exists at least one vector $R = \tilde{R} - r \in S_R$ satisfying (118) with equality. This subsequently implies that the set $\mathcal{R}_l(D^*)$ is a polytope, which completes the proof.